Abstract. In mean-payoff games, the objective of the protagonist is to ensure that the limit average 
of an infinite sequence of numeric weights is nonnegative. In energy games, the objective is to ensure 
that the running sum of weights is always nonnegative. Multi-mean-payoff and multi-energy games 
replace individual weights by tuples, and the limit average (resp. running sum) of each coordinate must 
be (resp. remain) nonnegative. These games have applications in the synthesis of resource-bounded 
processes with multiple resources. 

We prove the finite-memory determinacy of multi-energy games and show the inter-reducibility of 
multi-mean-payoff and multi-energy games for finite-memory strategies. We also improve the computational 
complexity for solving both classes of games with finite-memory strategies: while the previously best 
known upper bound was EXPSPACE, and no lower bound was known, we give an optimal 
coNP-complete bound. For memoryless strategies, we show that the problem of deciding the existence of a 
winning strategy for the protagonist is NP-complete. Finally we present the first solution of 
multi-mean-payoff games with infinite-memory strategies. We show that multi-mean-payoff games with 
mean-payoff-sup objectives can be decided in NP ∩ coNP, whereas multi-mean-payoff games with mean-payoff-inf 
objectives are coNP-complete. 

Keywords: Games on graphs; mean-payoff objectives; energy objectives; multi-dimensional objectives. 

1 Introduction 

Graph games and multi-objectives. Two-player games on graphs are central in many applications of 
computer science. For example, in the synthesis problem, implementations of reactive systems are 
obtained from winning strategies in games with a qualitative objective formalized by an ω-regular 
specification [22, 21, 1]. In these applications, the games have a qualitative (boolean) objective that 
determines which player wins. On the other hand, games with quantitative objectives, which are 
natural models in economics (where players have to optimize a real-valued payoff), have also been 
studied in the context of automated design [23, 9, 24]. In the recent past, there has been considerable 
interest in the design of reactive systems that work in resource-constrained environments (such as 

* Preliminary versions appeared in the Proceedings of the IARCS Annual Conference on Foundations of Software 
Technology and Theoretical Computer Science (FSTTCS), Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 
LIPIcs, 2010, pp. 505-516, and in the Proceedings of the 14th International Conference on Foundations of Software 
Science and Computational Structures (FoSSaCS), Lecture Notes in Computer Science 6604, Springer, 2011, pp. 
275-289. 

** Corresponding author: Laurent Doyen; address: LSV - ENS Cachan, 61 av. du President Wilson, 94235 Cachan 
Cedex, France; email: doyen@lsv.ens-cachan.fr. 



embedded systems). The specifications for such reactive systems are quantitative, and give rise to 
quantitative games. In most system design problems, there is no unique objective to be optimized, 
but multiple, potentially conflicting objectives. For example, in designing a computer system, one is 
interested not only in minimizing the average response time but also the average power consumption. 
In this work we study such multi-objective generalizations of the two most widely used quantitative 
objectives in games, namely, mean-payoff and energy objectives [11,24,6,3]. 

Multi-mean-payoff games. A multi-mean-payoff game is played on a finite weighted game graph by 
two players. The vertices of the game graph are partitioned into positions that belong to player 1 and 
positions that belong to player 2. Edges of the graphs are labeled with $k$-dimensional vectors $w$ of 
integer values, i.e., $w \in \mathbb{Z}^k$. The game is played as follows. A pebble is placed on a designated initial 
vertex of the game graph. The game is played in rounds in which the player owning the position 
where the pebble lies moves the pebble to an adjacent position of the graph using an outgoing edge. 
The game is played for an infinite number of rounds, resulting in an infinite path through the graph, 
called a play. The value associated to a play is the mean value in each dimension of the vectors of 
weights labeling the edges of the play. Accordingly, the winning condition for player 1 is defined by 
a vector of rational values $v \in \mathbb{Q}^k$ that specifies a threshold for each dimension. A play is winning 
for player 1 if its vector of mean values is at least v. All other plays are winning for player 2, thus the 
game is zero-sum. We are interested in the problem of deciding the existence of a winning strategy 
for player 1 in multi-mean-payoff games. In general infinite memory may be required to win 
multi-mean-payoff games, but in many practical applications such as the synthesis of reactive systems 
with multiple resource constraints, multi-mean-payoff games with finite memory are the relevant 
problem. They also provide the framework for the synthesis of specifications defined by 
mean-payoff conditions [2,8], and the synthesis question for such specifications under regular (ultimately 
periodic) words corresponds to multi-mean-payoff games with finite-memory strategies. Hence we 
study multi-mean-payoff games both for general strategies and for finite-memory strategies. 

Multi-energy games. In multi-energy games, the winning condition for player 1 requires that, given 
an initial credit $v_0 \in \mathbb{N}^k$, the sum of $v_0$ and all the vectors labeling edges up to position $i$ in the 
play is nonnegative, for all $i \in \mathbb{N}$. The decision problem for multi-energy games asks whether there 
exists an initial credit $v_0$ and a strategy for player 1 to maintain the energy nonnegative in all 
dimensions against all strategies of player 2. 

Contributions. In this paper, we study the strategy complexity and computational complexity of 
solving multi-mean-payoff and multi-energy games. The contributions are as follows. 

First, we show that multi-energy and multi-mean-payoff games are determined when played with 
finite-memory strategies. When considering finite-memory strategies, those games correspond to the 
synthesis question with ultimately periodic words, and they enjoy pleasant mathematical properties 
like existence of the limit of the mean value of the weights. We also establish that multi-energy and 
multi-mean-payoff games are not determined for memoryless strategies. Additionally, we show for 
multi-energy games determinacy under finite-memory coincides with determinacy under arbitrary 
strategies, and each player has a winning strategy if and only if he has a finite-memory winning 
strategy. In contrast, we show for multi-mean-payoff games that determinacy under finite-memory 
and determinacy under arbitrary strategies do not coincide. Moreover, for multi-mean-payoff games, 
when player 1 is restricted to finite-memory strategies, the winning set for player 1 
remains unchanged irrespective of whether we consider finite-memory or infinite-memory 
counter-strategies for player 2. 

Second, we show that under the hypothesis that both players play finite-memory strategies, or both 
play memoryless strategies, the decision problems for multi-mean-payoff games and multi-energy 
games are equivalent. 

Third, we study the computational complexity of the decision problems for multi-mean-payoff 
games and multi-energy games, both for finite-memory strategies and the special case of memoryless 
strategies. Our complexity results can be summarized as follows. (A) For finite-memory strategies, 
we provide a nondeterministic polynomial-time algorithm for deciding negative instances of the 
problems⁵. Thus we show that the decision problems are in coNP. This significantly improves the 
complexity as compared to the EXPSPACE algorithm that can be obtained by reduction to Vass 
(vector addition systems with states) [4]. Furthermore, we establish a coNP lower bound for these 
problems by reduction from the complement of the 3SAT problem, hence showing that the problem 
is coNP-complete. (B) For the case of memoryless strategies, as the games are not determined, 
we consider the problem of determining if player 1 has a memoryless winning strategy. First, we 
show that the problem of determining if player 1 has a memoryless winning strategy is in NP, and 
then show that the problem is NP-hard even when the weights are restricted to $\{-1, 0, 1\}$ and in 
dimension 2. 

Finally, we study the computational complexity of multi-mean-payoff games for infinite-memory 
strategies. Our complexity results are summarized as follows. (A) We show that multi-mean-payoff 
games with mean-payoff-sup objectives can be decided in NP ∩ coNP (in the same complexity as for 
games with single mean-payoff objectives). Moreover, we also show that if mean-payoff games with 
single mean-payoff objective can be solved in polynomial time, then multi-mean-payoff games with 
mean-payoff-sup objectives can also be solved in polynomial time. (B) Multi-mean-payoff games 
with mean-payoff-inf objectives are coNP-complete. (C) Finally, we show that multi-mean-payoff 
games with combination of mean-payoff-sup and mean-payoff-inf objectives are also coNP-complete. 

In summary, our results establish optimal computational complexity results for multi-mean- 
payoff and multi-energy games under finite-memory, memoryless and infinite-memory strategies. 

Related works. Mean-payoff games, which are the one-dimension version of our multi-mean-payoff 
games, have been extensively studied starting with the work of Ehrenfeucht and Mycielski [11], 
who prove memoryless determinacy for these games. Because of memoryless determinacy, it is 
easy to show that the decision problem for mean-payoff games lies in NP ∩ coNP, but despite large 
research efforts, no polynomial-time algorithm is known for that problem. A pseudo-polynomial 
time algorithm has been proposed by Zwick and Paterson in [24], and improved in [5]. The 
one-dimension special case of multi-energy games has been introduced in [6] and further studied in [3] 
where log-space equivalence with classical mean-payoff games is established. 

Multi-energy games can be viewed as games played on Vass (vector addition systems with 
states) where the objective is to avoid unbounded decreasing of the counters. A solution to such 
games on Vass is provided in [4] (see in particular Lemma 3.4 in [4]) with a PSPACE algorithm 
when the weights are in $\{-1, 0, 1\}$, leading to an EXPSPACE algorithm when the weights are arbitrary 
integers. We drastically improve the EXPSPACE upper bound by providing a coNP algorithm for 
the problem, and we also provide a coNP lower bound even when the weights are restricted to 
$\{-1, 0, 1\}$. Finally the work in [12] considers multi-dimension energy games with fixed initial credit, 
as well as variants of energy games with upper and lower energy bounds. 



5 Negative instances are those where player 1 is losing, and by determinacy under finite-memory where player 2 is 
winning. 






2 Definitions 



Well quasi-orders. A relation $\preceq$ over a set $D$ is a well quasi-order if the following conditions hold: 
(a) $\preceq$ is transitive and reflexive, and (b) for all $f : \mathbb{N} \to D$, there exist $i_1, i_2 \in \mathbb{N}$ such that $i_1 < i_2$ 
and $f(i_1) \preceq f(i_2)$. It is known that $(\mathbb{N}^k, \leq)$ is a well quasi-order and that the Cartesian product of 
two well quasi-ordered sets is a well quasi-ordered set [10]. 

Multi-weighted two-player game structures. A multi-weighted two-player game structure (or 
simply a game) is a tuple $G = (S_1, S_2, E, w)$ where $S_1 \cap S_2 = \emptyset$, and $S_i$ ($i = 1, 2$) is the finite set of 
player-$i$ states (we denote by $S = S_1 \cup S_2$ the state space), $E \subseteq S \times S$ is the set of edges such that 
for all $s \in S$, there exists $s' \in S$ such that $(s, s') \in E$, and $w : E \to \mathbb{Z}^k$ is the multi-weight labeling 
function. The parameter $k \in \mathbb{N}$ is the dimension of the multi-weights. The game $G$ is a one-player 
game if $S_2 = \emptyset$. The subgraph of $G$ induced by a set $T \subseteq S$ is $G \upharpoonright T = (S_1 \cap T, S_2 \cap T, E \cap (T \times T), w)$. 
Note that $G \upharpoonright T$ is a game structure if for all $s \in T$, there exists $s' \in T$ such that $(s, s') \in E$. 

A play in $G$ from an initial state $s_{init} \in S$ is an infinite sequence $\pi = s_0 s_1 \dots s_n \dots$ of states 
such that (i) $s_0 = s_{init}$, and (ii) $(s_i, s_{i+1}) \in E$ for all $i \geq 0$. The prefix of length $n$ of $\pi$ is the finite 
sequence $\pi(n) = s_0 s_1 \dots s_n$, its last element $s_n$ is denoted $\mathsf{Last}(\pi(n))$ and its length $|\pi(n)|$. The set 
of all plays in $G$ is denoted $\mathsf{Plays}(G)$. 

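The energy level vector of a play prefix $\rho = s_0 s_1 \dots s_n$ is $\mathsf{EL}(\rho) = \sum_{i=0}^{n-1} w(s_i, s_{i+1})$, and the 
mean-payoff vectors of a play $\pi = s_0 s_1 \dots s_n \dots$ are defined as follows (in dimension $1 \leq j \leq k$): 
$\overline{\mathsf{MP}}(\pi)_j = \limsup_{n \to \infty} \frac{1}{n} \cdot \mathsf{EL}(\pi(n))_j$, and $\underline{\mathsf{MP}}(\pi)_j = \liminf_{n \to \infty} \frac{1}{n} \cdot \mathsf{EL}(\pi(n))_j$. 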
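As a small illustration of these definitions, the following Python sketch computes the energy level of a prefix from the list of its edge weights, and the mean-payoff vector of an ultimately periodic play. It relies on the standard fact that, for a play that eventually repeats a single cycle forever, the limsup and liminf averages coincide and equal the average weight of the cycle. The function names and the example weights are hypothetical, chosen only for this sketch.

```python
from fractions import Fraction


def energy_level(weights):
    """Coordinate-wise sum of the weight vectors labeling the edges of a prefix.

    Returns the empty tuple for an empty list, since the dimension is unknown.
    """
    if not weights:
        return ()
    return tuple(sum(column) for column in zip(*weights))


def lasso_mean_payoff(cycle_weights):
    """Mean-payoff vector of a play that eventually repeats one cycle forever.

    The finite prefix before the cycle does not influence the limit, so only
    the edge weights of the repeated cycle are needed.
    """
    total = energy_level(cycle_weights)
    return tuple(Fraction(x, len(cycle_weights)) for x in total)


# Hypothetical 2-dimensional cycle made of three edges.
cycle = [(2, -1), (-1, 0), (-1, 1)]
print(energy_level(cycle))       # (0, 0)
print(lasso_mean_payoff(cycle))  # (Fraction(0, 1), Fraction(0, 1))
```

Exact rationals are used instead of floats so that comparisons with a threshold such as 0 are not affected by rounding.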

Strategies. A strategy of player $i$ ($i \in \{1, 2\}$) in $G$ is a function $\lambda_i : S^* \cdot S_i \to S$ such that 
$(s, \lambda_i(\rho \cdot s)) \in E$ for all $\rho \in S^*$ and all $s \in S_i$. A play $\pi = s_0 s_1 \dots \in \mathsf{Plays}(G)$ is consistent with a 
strategy $\lambda_i$ of player $i$ if $s_{j+1} = \lambda_i(s_0 s_1 \dots s_j)$ for all $j \geq 0$ such that $s_j \in S_i$. The outcome from 
a state $s_{init}$ of a pair of strategies, $\lambda_1$ for player 1 and $\lambda_2$ for player 2, is the (unique) play from 
$s_{init}$ that is consistent with both $\lambda_1$ and $\lambda_2$. We denote this play by $\mathsf{outcome}_G(s_{init}, \lambda_1, \lambda_2)$. We denote 
by $T_{\lambda_i(s_{init})}$ the strategy tree obtained as the unfolding of the game $G$ from $s_{init}$ when strategy $\lambda_i$ 
is used. The nodes of this tree are all prefixes of the plays from $s_{init}$ that are consistent with the 
strategy $\lambda_i$ of player $i$. 

A strategy $\lambda_i$ for player $i$ uses finite-memory if it can be encoded by a deterministic Moore 
machine $(M, m_0, \alpha_u, \alpha_n)$ where $M$ is a finite set of states (the memory of the strategy), $m_0 \in M$ 
is the initial memory state, $\alpha_u : M \times S \to M$ is an update function, and $\alpha_n : M \times S_i \to S$ is the 
next-action function. If the game is in a player-$i$ state $s \in S_i$ and $m \in M$ is the current memory 
value, then the strategy chooses $s' = \alpha_n(m, s)$ as the next state and the memory is updated to 
$\alpha_u(m, s)$. Formally, $(M, m_0, \alpha_u, \alpha_n)$ defines the strategy $\lambda$ such that $\lambda(\rho \cdot s) = \alpha_n(\hat{\alpha}_u(m_0, \rho), s)$ 
for all $\rho \in S^*$ and $s \in S_i$, where $\hat{\alpha}_u$ extends $\alpha_u$ to sequences of states as usual. The strategy is 
memoryless if $|M| = 1$. Given an initial state $s_{init}$ and a finite-memory strategy $\lambda_i$ of player $i$, let 
$G_{\lambda_i(s_{init})}$ be the graph obtained as the product of $G$ with the Moore machine defining $\lambda_i$, with initial 
vertex $\langle m_0, s_{init} \rangle$ and where $(\langle m, s \rangle, \langle m', s' \rangle)$ is a transition in the graph if $m' = \alpha_u(m, s)$, and either 
$s \in S_i$ and $s' = \alpha_n(m, s)$, or $s \in S_{3-i}$ and $(s, s') \in E$. 
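The Moore-machine encoding of a finite-memory strategy can be sketched in Python as follows. This is a minimal illustration, assuming the update and next-action functions are given as dictionaries; the class name `MooreStrategy` and the three-state example are hypothetical and not taken from the paper.

```python
class MooreStrategy:
    """Finite-memory strategy encoded by a Moore machine (M, m0, update, next_action)."""

    def __init__(self, m0, update, next_action):
        self.m0 = m0                    # initial memory state
        self.update = update            # dict: (memory, state) -> memory
        self.next_action = next_action  # dict: (memory, own state) -> successor

    def choose(self, history):
        """Successor chosen after `history`, a non-empty list of states whose
        last element is owned by the player using this strategy."""
        *prefix, last = history
        memory = self.m0
        for state in prefix:            # extended update over the prefix
            memory = self.update[(memory, state)]
        return self.next_action[(memory, last)]


# Hypothetical example: the player owns state 'a' and alternates between the
# successors 'b' and 'c'; one bit of memory records how often 'a' was left.
update = {(m, s): (1 - m if s == "a" else m) for m in (0, 1) for s in "abc"}
next_action = {(0, "a"): "b", (1, "a"): "c"}
strategy = MooreStrategy(0, update, next_action)

print(strategy.choose(["a"]))            # b
print(strategy.choose(["a", "b", "a"]))  # c
```

A memoryless strategy is the special case where the memory set has a single element, so that the choice depends only on the last state of the history.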

Objectives. An objective for player 1 in $G$ is a set of plays $\varphi \subseteq \mathsf{Plays}(G)$. Given a game $G$, an 
initial state $s_0$, and an objective $\varphi$, we say that a strategy $\lambda_1$ is winning for player 1 from $s_0$ if 
for all plays $\pi \in \mathsf{Plays}(G)$ from $s_0$ that are consistent with $\lambda_1$, we have that $\pi \in \varphi$; and we say 
that a strategy $\lambda_2$ is winning for player 2 from $s_0$ if for all plays $\pi \in \mathsf{Plays}(G)$ from $s_0$ that are 
consistent with $\lambda_2$, we have that $\pi \notin \varphi$. We denote by $\langle\langle 1 \rangle\rangle \varphi$ the set of states $s_0$ such that there 
exists a winning strategy for player 1 from $s_0$, and by $\langle\langle 2 \rangle\rangle \neg\varphi$ the set of states $s_0$ such that there 
exists a winning strategy for player 2 from $s_0$. Note that $\langle\langle 1 \rangle\rangle \varphi \cap \langle\langle 2 \rangle\rangle \neg\varphi = \emptyset$ by definition. We 
consider the following objectives: 

- Energy objectives. Given an initial energy vector $v_0 \in \mathbb{N}^k$, the multi-energy objective 
$\mathsf{PosEnergy}_G(v_0) = \{\pi \in \mathsf{Plays}(G) \mid \forall n \geq 0 : v_0 + \mathsf{EL}(\pi(n)) \geq \{0\}^k\}$ requires that the energy level 
in all dimensions always remains nonnegative. 

- Mean-payoff objectives. Given two sets $I, J \subseteq \{1, \dots, k\}$, the multi-mean-payoff objective 
$\mathsf{MeanPayoffInfSup}_G(I, J) = \{\pi \in \mathsf{Plays}(G) \mid \forall i \in I : \underline{\mathsf{MP}}(\pi)_i \geq 0 \land \forall j \in J : \overline{\mathsf{MP}}(\pi)_j \geq 0\}$ 
requires that for all dimensions in $I$ the mean-payoff-inf value be nonnegative, and for all dimensions 
in $J$ the mean-payoff-sup value be nonnegative. 

When the game $G$ is clear from the context we omit the subscript in objective names. Note 
that arbitrary thresholds $\frac{a}{b} \in \mathbb{Q}$ can be considered in the multi-mean-payoff objectives because the 
mean-payoff value computed according to the weight function $w$ is greater than $\frac{a}{b}$ if and only if the 
mean-payoff value according to the weight function $b \cdot w - a$ is greater than $0$, where $(b \cdot w - a)(e) = 
b \cdot w(e) - a$ for all $e \in E$. For the special case of $I = \emptyset$ and $J = \{1, \dots, k\}$, we denote by 
$\mathsf{MeanPayoffSup} = \mathsf{MeanPayoffInfSup}(\emptyset, J)$ the conjunction of all mean-payoff-sup objectives, and for 
$I = \{1, \dots, k\}$ and $J = \emptyset$ we denote by $\mathsf{MeanPayoffInf} = \mathsf{MeanPayoffInfSup}(I, \emptyset)$ the conjunction of 
all mean-payoff-inf objectives. We denote by $\mathsf{MeanPayoffSup}_i = \mathsf{MeanPayoffInfSup}(\emptyset, \{i\})$ the single 
mean-payoff-sup objective in dimension $1 \leq i \leq k$. 
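The reduction of an arbitrary rational threshold a/b to the threshold 0 can be checked on a small example. The Python sketch below rescales one-dimensional weights by the map w(e) ↦ b · w(e) − a, assuming b > 0, and compares both mean values on a cycle; the helper name `shift_weights` and the sample weights are hypothetical.

```python
from fractions import Fraction


def shift_weights(weights, a, b):
    """Map every weight w(e) to b * w(e) - a; the denominator b must be positive."""
    if b <= 0:
        raise ValueError("b must be positive")
    return [b * w - a for w in weights]


# Hypothetical one-dimensional cycle with mean value 2/3, and threshold a/b = 1/2.
cycle = [1, 0, 1]
a, b = 1, 2

mean = Fraction(sum(cycle), len(cycle))              # 2/3
shifted = shift_weights(cycle, a, b)                 # [1, -1, 1]
shifted_mean = Fraction(sum(shifted), len(shifted))  # 1/3

# The original mean exceeds a/b exactly when the shifted mean exceeds 0.
print(mean > Fraction(a, b), shifted_mean > 0)       # True True
```

The equivalence holds because the shifted mean equals b times the original mean minus a, and multiplying by a positive b preserves the direction of the inequality.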

Decision problems. We consider the following decision problems: 

- The unknown initial credit problem asks, given a multi-weighted two-player game structure $G$, 
and an initial state $s_0$, to decide whether there exist an initial credit vector $v_0 \in \mathbb{N}^k$ and a 
winning strategy $\lambda_1$ for player 1 from $s_0$ for the objective $\mathsf{PosEnergy}_G(v_0)$. 

- The mean-payoff threshold problem asks, given a multi-weighted two-player game structure $G$, 
an initial state $s_0$, and two sets $I, J \subseteq \{1, \dots, k\}$ of indices, to decide whether there exists a 
winning strategy $\lambda_1$ for player 1 from $s_0$ for the objective $\mathsf{MeanPayoffInfSup}_G(I, J)$. 

Determinacy, determinacy under finite-memory, and determinacy by finite-memory. 

We now define the notions of determinacy, determinacy under finite-memory, and determinacy by 
finite-memory. 

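- (Determinacy). A game $G$ with state space $S$ and objective $\varphi$ is determined if from all states 
$s_0 \in S$, either player 1 or player 2 has a winning strategy, i.e. $S = \langle\langle 1 \rangle\rangle \varphi \cup \langle\langle 2 \rangle\rangle \neg\varphi$. Observe that 
since $\langle\langle 1 \rangle\rangle \varphi \cap \langle\langle 2 \rangle\rangle \neg\varphi = \emptyset$, determinacy means that $\langle\langle 1 \rangle\rangle \varphi$ and $\langle\langle 2 \rangle\rangle \neg\varphi$ partition the state space. 

- (Determinacy under finite-memory). We also consider determinacy under finite-memory 
strategies. Let $\langle\langle 1 \rangle\rangle^{finite} \varphi$ be the set of states $s_0$ from which player 1 has a finite-memory strategy 
$\lambda_1$ such that for all finite-memory strategies $\lambda_2$ of player 2, we have $\mathsf{outcome}_G(s_0, \lambda_1, \lambda_2) \in \varphi$. 
And let $\langle\langle 2 \rangle\rangle^{finite} \neg\varphi$ be the set of states $s_0$ from which player 2 has a finite-memory strategy $\lambda_2$ 
such that for all finite-memory strategies $\lambda_1$ of player 1, we have $\mathsf{outcome}_G(s_0, \lambda_1, \lambda_2) \notin \varphi$. 

A game $G$ with state space $S$ and objective $\varphi$ is determined under finite-memory if $S = 
\langle\langle 1 \rangle\rangle^{finite} \varphi \cup \langle\langle 2 \rangle\rangle^{finite} \neg\varphi$. Again observe that $\langle\langle 1 \rangle\rangle^{finite} \varphi \cap \langle\langle 2 \rangle\rangle^{finite} \neg\varphi = \emptyset$, and determinacy 
under finite-memory means that $\langle\langle 1 \rangle\rangle^{finite} \varphi$ and $\langle\langle 2 \rangle\rangle^{finite} \neg\varphi$ partition the state space. We say 
that determinacy and determinacy under finite-memory coincide for an objective $\varphi$, if for all 
game structures, we have $\langle\langle 1 \rangle\rangle \varphi = \langle\langle 1 \rangle\rangle^{finite} \varphi$ and $\langle\langle 2 \rangle\rangle \neg\varphi = \langle\langle 2 \rangle\rangle^{finite} \neg\varphi$. 

- (Determinacy by finite-memory). We also consider determinacy by finite-memory strategies. 
Let $\langle\langle 1 \rangle\rangle^{fin-inf} \varphi$ be the set of states $s_0$ from which player 1 has a finite-memory strategy $\lambda_1$ 
such that for all strategies $\lambda_2$ of player 2, we have $\mathsf{outcome}_G(s_0, \lambda_1, \lambda_2) \in \varphi$ (i.e., player 1 is 
restricted to finite-memory strategies whereas strategies for player 2 are general infinite-memory 
strategies). The set of states $s_0$ from which player 2 has a finite-memory strategy $\lambda_2$ such that 
for all strategies $\lambda_1$ of player 1, we have $\mathsf{outcome}_G(s_0, \lambda_1, \lambda_2) \notin \varphi$ is denoted $\langle\langle 2 \rangle\rangle^{fin-inf} \neg\varphi$. If for 
all game structures we have $\langle\langle 1 \rangle\rangle \varphi = \langle\langle 1 \rangle\rangle^{fin-inf} \varphi$ and $\langle\langle 2 \rangle\rangle \neg\varphi = \langle\langle 2 \rangle\rangle^{fin-inf} \neg\varphi$, and all game 
structures with objective $\varphi$ are determined, then we say that determinacy by finite-memory 
strategies holds for $\varphi$. 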

We first observe that determinacy by finite-memory strategies implies that finite-memory 
strategies suffice for both players, and determinacy by finite-memory implies determinacy under 
finite-memory (since given a finite-memory strategy of a player, if there is a counter-strategy 
for the opponent, then there is a finite-memory one by determinacy by finite-memory). Thus 
determinacy by finite-memory strategies implies that (i) $\langle\langle 1 \rangle\rangle \varphi = \langle\langle 1 \rangle\rangle^{finite} \varphi = \langle\langle 1 \rangle\rangle^{fin-inf} \varphi$; and 
(ii) $\langle\langle 2 \rangle\rangle \neg\varphi = \langle\langle 2 \rangle\rangle^{finite} \neg\varphi = \langle\langle 2 \rangle\rangle^{fin-inf} \neg\varphi$. As we will show that determinacy and determinacy 
under finite-memory do not coincide for multi-mean-payoff games (Theorem 5), we consider for 
multi-mean-payoff objectives $\varphi$ both (1) winning under finite-memory strategies, i.e. to decide 
whether $s_0 \in \langle\langle 1 \rangle\rangle^{finite} \varphi$ for a given initial state $s_0$; and (2) winning under general strategies, i.e. 
to decide whether $s_0 \in \langle\langle 1 \rangle\rangle \varphi$ for a given initial state $s_0$. For multi-energy games we will show 
determinacy by finite-memory strategies. 

Determinacy for multi-mean-payoff and multi-energy objectives follows from a general determi- 
nacy result for Borel objectives [19]: (a) multi-mean-payoff objectives can be expressed as a finite 
intersection of one-dimensional mean-payoff objectives which are complete for the third level of 
the Borel hierarchy [7]; and (b) multi-energy objectives can be expressed as a finite intersection of 
one-dimensional energy objectives which are closed sets. 

Theorem 1 (Determinacy [19]). Multi-mean-payoff and multi-energy games are determined. 

Attractors. The player-1 attractor of a given set $T \subseteq S$ of target states is the set of states from 
which player 1 can force the play to eventually reach a state in $T$. The attractor is defined inductively as 
follows: let $A_0 = T$, and for all $j \geq 0$ let 

$A_{j+1} = A_j \cup \{s \in S_1 \mid \exists (s, t) \in E : t \in A_j\} \cup \{s \in S_2 \mid \forall (s, t) \in E : t \in A_j\}$ 

denote the set of states from where player 1 can ensure to reach $A_j$ within one step irrespective of 
the choice of player 2. Then the player-1 attractor is $\mathsf{Attr}_1(T) = \bigcup_{j \geq 0} A_j$. The player-2 attractor 
$\mathsf{Attr}_2(T)$ is defined symmetrically. Note that for $i = 1, 2$, the subgraph $G \upharpoonright (S \setminus \mathsf{Attr}_i(T))$ is again a 
game structure (i.e., every state has an outgoing edge). For all multi-mean-payoff objectives $\varphi$ (and 
in general for all tail objectives [7]), we have $\langle\langle 1 \rangle\rangle \varphi = \mathsf{Attr}_1(\langle\langle 1 \rangle\rangle \varphi)$ and $\langle\langle 2 \rangle\rangle \neg\varphi = \mathsf{Attr}_2(\langle\langle 2 \rangle\rangle \neg\varphi)$. 
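The inductive definition of the attractor translates directly into a fixpoint computation. The following Python sketch is a simple (not optimized) version, assuming states are hashable values, edges are pairs, and every state has at least one outgoing edge as required of a game structure; the function name and the example graph are hypothetical.

```python
def attractor(player_states, opponent_states, edges, target):
    """States from which the player can force the play to reach `target`.

    Assumes every state has at least one outgoing edge.
    """
    successors = {s: set() for s in player_states | opponent_states}
    for source, dest in edges:
        successors[source].add(dest)

    attr = set(target)
    changed = True
    while changed:                       # iterate until a fixpoint is reached
        changed = False
        for s, succ in successors.items():
            if s in attr:
                continue
            if s in player_states:
                joins = any(t in attr for t in succ)   # some edge leads into attr
            else:
                joins = all(t in attr for t in succ)   # all edges lead into attr
            if joins:
                attr.add(s)
                changed = True
    return attr


# Hypothetical game: player 1 owns 'a' and 't', player 2 owns 'b' and 'c'.
p1, p2 = {"a", "t"}, {"b", "c"}
edges = [("a", "b"), ("a", "c"), ("b", "t"), ("b", "a"), ("c", "t"), ("t", "t")]
print(sorted(attractor(p1, p2, edges, {"t"})))  # ['a', 'b', 'c', 't']
```

Calling the same function with the roles of the two state sets swapped gives the symmetric player-2 attractor.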

3 Multi-Energy Games 

In this section, we study the determinacy and complexity of multi-energy games. First, we show 
that finite-memory strategies are sufficient for player 1, and memoryless strategies are sufficient 
for player 2. It follows that multi-energy games are determined under finite-memory. We establish 
coNP complexity for the unknown initial credit problem, as well as a matching coNP-hardness 
result, and we show that under memoryless strategies for player 1 the problem is NP-complete. 
Finally, we show that the unknown initial credit problem is log-space equivalent to the mean-payoff 
threshold problem when the players have to use finite-memory strategies (and in general 
infinite-memory strategies are more powerful than finite-memory strategies in multi-mean-payoff games). 
The case of infinite-memory strategies in multi-mean-payoff games is addressed in Section 4. 

Determinacy under finite-memory. The next lemmas show that finite-memory strategies are 
sufficient for player 1 in multi-energy games, and that memoryless strategies are sufficient for 
player 2. 

Lemma 1. For all multi-weighted two-player game structures $G$ and initial states $s_0$, the 
answer to the unknown initial credit problem is Yes if and only if there exist an initial credit 
$v_0 \in \mathbb{N}^k$ and a finite-memory strategy $\lambda_1^{FM}$ for player 1 such that for all strategies $\lambda_2$ of player 2, 
$\mathsf{outcome}_G(s_0, \lambda_1^{FM}, \lambda_2) \in \mathsf{PosEnergy}_G(v_0)$. 

Proof. One direction is trivial. For the other direction, assume that $\lambda_1$ is a (not necessarily 
finite-memory) winning strategy for player 1 in $G$ from $s_0$ with initial credit $v_0 \in \mathbb{N}^k$. We show how to 
construct from $\lambda_1$ a finite-memory strategy $\lambda_1^{FM}$ that is winning from $s_0$ against all strategies of 
player 2 for initial credit $v_0$. 

Consider the strategy tree $T_{\lambda_1(s_0)}$ and associate to each node $\rho = s_0 s_1 \dots s_n$ in this tree the 
energy vector $v_0 + \mathsf{EL}(\rho)$. Since $\lambda_1$ is winning, we have $v_0 + \mathsf{EL}(\rho) \in \mathbb{N}^k$ for all $\rho \in T_{\lambda_1(s_0)}$. Now, 
consider the relation $\sqsubseteq$ on the set $S \times \mathbb{N}^k$ defined as follows: $(s_1, v_1) \sqsubseteq (s_2, v_2)$ if $s_1 = s_2$ and 
$v_1 \leq v_2$ (i.e., $v_1(i) \leq v_2(i)$ for all $i$, $1 \leq i \leq k$). The relation $\sqsubseteq$ is a well quasi-order. As a 
consequence, on every infinite branch $\pi = s_0 s_1 \dots s_n \dots$ of $T_{\lambda_1(s_0)}$ there exist two indices $i < j$ 
such that $\mathsf{Last}(\pi(i)) = \mathsf{Last}(\pi(j))$ and $\mathsf{EL}(\pi(i)) \leq \mathsf{EL}(\pi(j))$. We say that node $\pi(j)$ subsumes node 
$\pi(i)$. Now, let $T^{FM}$ be the tree $T_{\lambda_1(s_0)}$ where we stop each branch when we reach a node $n_2$ that 
subsumes one of its ancestor nodes $n_1$. By König's lemma [16] and Dickson's lemma [10], the tree 
$T^{FM}$ is finite. From the node $n_2$, player 1 can mimic the strategy played in $n_1$ because the energy 
level in $n_2$ is at least as large as in $n_1$. From $T^{FM}$, we can construct the Moore machine of a finite-memory 
strategy $\lambda_1^{FM}$ that is winning in the multi-energy game $G$ from $s_0$ with initial energy level $v_0$. □ 
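The subsumption test used to cut branches of the strategy tree can be illustrated on a single branch. The Python sketch below scans a branch, given as a list of (state, energy vector) pairs, and returns the first node that subsumes one of its ancestors. It only illustrates the cutting criterion, not the full construction of the Moore machine; the function name and the sample branch are hypothetical.

```python
def first_subsuming_pair(branch):
    """Return (i, j) with i < j such that node j subsumes its ancestor i.

    `branch` is a list of (state, energy_vector) pairs along one branch.
    Node j subsumes node i when both carry the same state and the energy
    vector at j is at least the one at i in every dimension. The smallest
    such j is returned, or None if no node of the branch subsumes an ancestor.
    """
    for j, (state_j, energy_j) in enumerate(branch):
        for i in range(j):
            state_i, energy_i = branch[i]
            if state_i == state_j and all(x <= y for x, y in zip(energy_i, energy_j)):
                return i, j
    return None


# Hypothetical branch with two states and 2-dimensional energy vectors.
branch = [("s0", (2, 0)), ("s1", (1, 1)), ("s0", (1, 2)), ("s1", (1, 1))]
print(first_subsuming_pair(branch))  # (1, 3)
```

In the example, the second visit to s0 does not subsume the first one because the energy dropped in the first dimension, so the branch is only cut at the second visit to s1.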

Lemma 2 ([4]). For all multi-weighted two-player game structures $G$ and initial states $s_0$, the 
answer to the unknown initial credit problem is No if and only if there exists a memoryless strategy 
$\lambda_2$ for player 2, such that for all initial credit vectors $v_0 \in \mathbb{N}^k$ and all strategies $\lambda_1$ for player 1 we 
have $\mathsf{outcome}_G(s_0, \lambda_1, \lambda_2) \notin \mathsf{PosEnergy}_G(v_0)$. 

Proof. The proof was given in [4, Lemma 19]. Intuitively, consider a player-2 state $s \in S_2$ with two 
successors $s'$ and $s''$. If an initial credit vector $v_0'$ is sufficient for player 1 to win from $s_{init}$ against 
player 2 always choosing $s'$, and $v_0''$ is sufficient from $s$ against player 2 always choosing $s''$, then 
$v_0' + v_0''$ is sufficient from $s_{init}$ against player 2 arbitrarily alternating between $s'$ and $s''$. This is 
because if player 1 maintains the energy nonnegative in all dimensions when the initial credit is $v_0$, 
then he can maintain the energy always above $\Delta$ when the initial credit is $v_0 + \Delta$ ($\Delta \in \mathbb{N}^k$). □ 

The previous two lemmas establish both determinacy by finite-memory strategies and the fact that 
determinacy and determinacy under finite-memory coincide. As a consequence, we get the following 
theorem. 

Fig. 1. Player 1 (round states) wins with initial credit (2,0) when player 2 (square states) is restricted to 
memoryless strategies, but not when player 2 can use arbitrary strategies. 



Theorem 2. Multi-energy games are determined by finite-memory, hence determined under 
finite-memory. Determinacy coincides with determinacy under finite-memory for multi-energy games. 

Remark 1. Note that even if player 2 can be restricted to play memoryless strategies in 
multi-energy games, it may be that player 1 is winning with some initial credit vector $v_0$ when player 2 
is memoryless, and is not winning with the same initial credit vector $v_0$ when player 2 can use 
arbitrary strategies. This situation is illustrated in Fig. 1 where player 1 (owning round states) can 
maintain the energy nonnegative in all dimensions with initial credit $(2, 0)$ when player 2 (owning 
square states) is memoryless. Indeed, either player 2 chooses the left edge from $s_0$ to $s_1$ and player 1 
wins, or player 2 chooses the right edge from $s_0$ to $s_2$, and player 1 wins as well by alternating the 
edges back to $s_0$. Now, if player 2 has memory, then player 2 wins by choosing first the right edge 
to $s_2$, which forces player 1 to come back to $s_0$ with multi-weight $(-1, 1)$. The energy level is now 
$(1, 1)$ in $s_0$ and player 2 chooses the left edge to $s_1$ which is losing for player 1. Note that player 1 
wins with initial credit $(2, 1)$ and $(3, 0)$ (or any larger credit) against all arbitrary strategies of 
player 2. 

Complexity. We show that the unknown initial credit problem is coNP-complete. First, we show 
that the one-player version of this game can be solved by checking the existence of a circuit (i.e., a 
not necessarily simple cycle) with nonnegative effect in all dimensions, and we use the memoryless 
result for player 2 (Lemma 2) to define a coNP algorithm. Second, we present a coNP-hardness 
proof. 

Theorem 3. The unknown initial credit problem is coNP-complete. 

First, we need the following result about zero-circuits in multi-weighted directed graphs (a graph 
is a one-player game). A zero-circuit is a finite sequence $s_0 s_1 \dots s_n$ with $n \geq 1$ such that $s_0 = s_n$, 
$(s_i, s_{i+1}) \in E$ for all $0 \leq i < n$, and $\sum_{i=0}^{n-1} w(s_i, s_{i+1}) = (0, 0, \dots, 0)$. The circuit need not be simple. 

Lemma 3 ([18]). Deciding if a multi-weighted directed graph contains a zero circuit can be done 
in polynomial time. 

The result of Theorem 3 follows from the next two lemmas. 
Lemma 4. The unknown initial credit problem is in coNP. 

Proof. Let $G$ be a multi-weighted two-player game structure, and $s_0$ be an initial state. By Lemma 2, 
we know that player 2 can be restricted to play memoryless strategies. A coNP algorithm guesses 
a memoryless strategy $\lambda_2$ and checks in polynomial time that it is winning for player 2 using the 
following argument. 

Consider the graph $G_{\lambda_2(s_0)}$ as a one-player game (in which all states belong to player 1). We 
show that if there exists an initial energy level $v_0$ and an infinite play $\pi = s_0 s_1 \dots s_n \dots$ in $G_{\lambda_2(s_0)}$ 
such that $\pi \in \mathsf{PosEnergy}(v_0)$, then there exists a reachable circuit in $G_{\lambda_2(s_0)}$ with nonnegative 
effect in all dimensions. To show this, we extend $\pi$ with the energy level as follows: let $\pi' = 
(s_0, w_0)(s_1, w_1) \dots (s_n, w_n) \dots$ where $w_0 = v_0$ and for all $i \geq 1$, $w_i = v_0 + \mathsf{EL}(\pi(i))$. Since $\pi \in 
\mathsf{PosEnergy}(v_0)$, we know that $w_i \in \mathbb{N}^k$ for all $i \geq 0$. Hence the following order defined on the pairs 
$(s, w) \in S \times \mathbb{N}^k$ is a well quasi-order: $(s, w) \sqsubseteq (s', w')$ if $s = s'$ and $w(j) \leq w'(j)$ for all $1 \leq j \leq k$. It 
follows that there exist two indices $i_1 < i_2$ in $\pi'$ such that $(s_{i_1}, w_{i_1}) \sqsubseteq (s_{i_2}, w_{i_2})$, and the underlying 
circuit through $s_{i_1} = s_{i_2}$ has nonnegative effect in all dimensions. 

Based on this, we can decide if there exists an initial energy vector $v_0$ and an infinite path in 
$G_{\lambda_2(s_0)}$ that satisfies $\mathsf{PosEnergy}_G(v_0)$ using the result of Lemma 3 on a modified version of $G_{\lambda_2(s_0)}$ 
obtained as follows. In every state of $G_{\lambda_2(s_0)}$, we add $k$ self-loops with respective multi-weights 
$(-1, 0, \dots, 0), (0, -1, 0, \dots, 0), \dots, (0, \dots, 0, -1)$, i.e. each self-loop removes one unit of energy in 
one dimension. It is easy to see that $G_{\lambda_2(s_0)}$ has a circuit with nonnegative effect in all dimensions 
if and only if the modified $G_{\lambda_2(s_0)}$ has a zero circuit, which can be determined in polynomial time. 
The result follows. □ 
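The graph modification used in this proof is easy to make concrete. The Python sketch below adds, to every state, one self-loop per dimension that removes a single unit of energy, and computes the effect of a circuit given as a list of edges. It only covers the construction; the polynomial-time zero-circuit test of Lemma 3 is not implemented here. The edge representation (triples of source, target, weight tuple) and the function names are assumptions of this sketch.

```python
def add_decrement_self_loops(states, edges, k):
    """Return `edges` extended with k self-loops on every state.

    Edges are (source, target, weight) triples with weight a tuple of length k.
    The j-th added self-loop has weight -1 in dimension j and 0 elsewhere.
    """
    extended = list(edges)
    for s in states:
        for j in range(k):
            weight = tuple(-1 if d == j else 0 for d in range(k))
            extended.append((s, s, weight))
    return extended


def effect(circuit_edges, k):
    """Coordinate-wise sum of the weights along a sequence of edges."""
    return tuple(sum(w[d] for _, _, w in circuit_edges) for d in range(k))


# Hypothetical one-state graph whose only circuit has effect (1, 2).
states, k = ["q"], 2
edges = [("q", "q", (1, 2))]
modified = add_decrement_self_loops(states, edges, k)

# Padding the circuit with decrementing self-loops yields a zero circuit.
padded = [edges[0], ("q", "q", (-1, 0)), ("q", "q", (0, -1)), ("q", "q", (0, -1))]
print(effect(padded, k))  # (0, 0)
```

The example shows one direction of the equivalence: a circuit with nonnegative effect can be padded with self-loops, one per surplus unit in each dimension, to obtain a circuit with effect exactly zero.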

Lemma 5. The unknown initial credit problem is coNP-hard. 

Proof. We present a reduction from the complement of the 3SAT problem. Since 3SAT is 
NP-complete [20], its complement is coNP-complete. 

Reduction. We show that the unknown initial credit problem for multi-weighted two-player game
structures is at least as hard as deciding whether a 3SAT formula is unsatisfiable. Consider a 3SAT
formula $\psi$ in CNF with clauses $C_1, C_2, \ldots$ over variables $\{x_1, x_2, \ldots, x_n\}$, where each clause
is a disjunction of exactly three literals (a literal is a variable or its complement). Given
the formula $\psi$, we construct a game graph as shown in Figure 2. The game graph is as follows:
from the initial state, player 1 chooses a clause, then from a clause player 2 chooses a literal that
appears in the clause (i.e., makes the clause true). From every literal the next state is the initial
state. We now describe the multi-weight labeling function $w$. In the multi-weight function there is
a component for every literal. For edges from the initial state to the clause states, and from the
clause states to the literals, the weight for every component is 0. We now define the weight function
for the edges from literals back to the initial state: for a literal $y$, and the edge from $y$ to the initial
state, the weight for the component of $y$ is 1, the weight for the component of the complement of
$y$ is $-1$, and for all the other components the weight is 0. We now introduce some terminology
for assignments of truth values to literals. We consider assignments that assign truth values to all
the literals. An assignment is valid if for every literal the truth values assigned to the literal and its
complement are complementary (i.e., for all $1 \leq i \leq n$, if $x_i$ is assigned true (resp. false), then the
complement $\overline{x}_i$ of $x_i$ is assigned false (resp. true)). An assignment that is not valid is conflicting
(i.e., for some $1 \leq i \leq n$, both $x_i$ and $\overline{x}_i$ are assigned the same truth value). If the formula $\psi$ is
satisfiable, then there is a valid assignment that satisfies all the clauses. If the formula $\psi$ is not
satisfiable, then every assignment that satisfies all the clauses must be conflicting. We now present
the two directions of the hardness proof.
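
The construction above can be sketched in code. The following is a minimal illustration, not the authors' implementation: the state names (`"init"`, `("clause", j)`, `("lit", l)`), the DIMACS-style integer encoding of literals, and the function name `build_reduction_game` are all assumptions made for this example.

```python
from typing import Dict, List, Tuple

Clause = Tuple[int, int, int]   # DIMACS-style literals: +i is x_i, -i is its complement


def build_reduction_game(clauses: List[Clause], n: int):
    """Build the game graph of the reduction for a 3SAT formula over x_1..x_n.

    Returns (edges, dim): edges maps (source, target) to a weight vector with
    one component per literal, and dim maps a literal to its component index.
    """
    literals = [lit for i in range(1, n + 1) for lit in (i, -i)]
    dim: Dict[int, int] = {lit: idx for idx, lit in enumerate(literals)}
    zero = tuple(0 for _ in literals)

    edges: Dict[tuple, Tuple[int, ...]] = {}
    for j, clause in enumerate(clauses):
        # Player 1 picks a clause from the initial state (weight 0 everywhere).
        edges[("init", ("clause", j))] = zero
        for lit in clause:
            # Player 2 picks a literal of the clause (weight 0 everywhere).
            edges[(("clause", j), ("lit", lit))] = zero
    for lit in literals:
        # Back to the initial state: +1 for the literal, -1 for its complement.
        weight = [0] * len(literals)
        weight[dim[lit]] = 1
        weight[dim[-lit]] = -1
        edges[(("lit", lit), "init")] = tuple(weight)
    return edges, dim
```

The graph has one clause state per clause and one literal state per literal, so its size is linear in the size of the formula, and all weights lie in $\{-1, 0, 1\}$.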

$\psi$ satisfiable implies player 2 winning. We show that if $\psi$ is satisfiable, then player 2 has a
memoryless winning strategy. Since $\psi$ is satisfiable, there is a valid assignment $A$ that satisfies every
clause. The memoryless strategy is constructed from the assignment $A$ as follows: for a clause $C_j$,
the strategy chooses as successor a literal that appears in $C_j$ and is set to true by the assignment.
Consider an arbitrary strategy for player 1, and the resulting infinite play: the literals visited in the
play are all assigned the truth value true by $A$, and the infinite play must visit some literal infinitely
often. Consider a literal $x$ that appears infinitely often in the play. Then the complement literal $\overline{x}$ is
never visited, and every time the literal $x$ is visited, the component corresponding to $\overline{x}$ decreases by 1.
Since $x$ appears infinitely often, it follows that the play is winning for player 2 for every finite
initial credit. It follows that the strategy for player 2 is winning, and the answer to the unknown
initial credit problem is "No".

Fig. 2. Game graph construction for a 3SAT formula (Lemma 5).

$\psi$ not satisfiable implies player 1 is winning. We now show that if $\psi$ is not satisfiable, then player 1
is winning. By determinacy, it suffices to show that player 2 is not winning, and by the existence of
memoryless winning strategies for player 2 (Lemma 2), it suffices to show that there is no memoryless
winning strategy for player 2. Fix an arbitrary memoryless strategy for player 2 (i.e., in every clause
player 2 chooses a literal that appears in the clause). If we consider the assignment $A$ obtained
from the memoryless strategy, then since $\psi$ is not satisfiable it follows that the assignment $A$ is
conflicting. Hence there must exist clauses $C_i$ and $C_j$ and a variable $x_k$ such that the strategy chooses
the literal $x_k$ in $C_i$ and the complement literal $\overline{x}_k$ in $C_j$. The strategy for player 1 that at the
initial state alternates between the clauses $C_i$ and $C_j$, together with an initial credit of 1 for the
components of $x_k$ and $\overline{x}_k$, and 0 for all other components, ensures that the strategy for player 2
is not winning. Hence the answer to the unknown initial credit problem is "Yes", and we have the
desired result. □

Observe that our hardness proof works with weights restricted to the set $\{-1, 0, 1\}$. The results
of [14] show that in two dimensions ($k = 2$) the unknown initial credit problem with weights in
$\{-1, 0, 1\}$ can be solved in polynomial time. The complexity for fixed dimensions $k \geq 3$ is not known.
With arbitrary integer weights, the unknown initial credit problem for $k = 1$ is in UP $\cap$ coUP [3].

Complexity for memoryless strategies. We consider multi-energy games when player 1 is
restricted to use memoryless strategies. The unknown initial credit problem for memoryless strategies
is to decide, given a multi-weighted two-player game structure $G$, and an initial state $s_0$, whether
there exist an initial credit vector $v_0 \in \mathbb{N}^k$ and a memoryless winning strategy $\lambda_1$ for player 1 from
$s_0$ for the objective $\mathsf{PosEnergy}_G(v_0)$.

Theorem 4. The unknown initial credit problem for memoryless strategies is NP-complete. 






Proof. The inclusion in NP is obtained as follows: the polynomial witness is the memoryless strategy
for player 1, and once the strategy is fixed we obtain a game graph with choices for player 2 only.
The verification checks that for every dimension there is no negative cycle, which can be
done in polynomial time by solving one-dimensional energy games on graphs with choices for
player 2 only [6, 3].
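
The verification step can be illustrated as follows. This is a sketch under assumptions: it uses standard Bellman-Ford negative-cycle detection as a stand-in for the one-dimensional energy game algorithms of [6, 3], it takes as input the graph that remains after player 1's memoryless strategy has been fixed, and the names `reachable`, `has_negative_cycle` and `witness_is_winning` are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str, Tuple[int, ...]]   # (source, target, multi-weight)


def reachable(start: str, edges: List[Edge]) -> Set[str]:
    """States reachable from start (depth-first search)."""
    succ: Dict[str, List[str]] = {}
    for u, v, _ in edges:
        succ.setdefault(u, []).append(v)
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in succ.get(u, []):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def has_negative_cycle(states: Set[str], edges: List[Edge], d: int) -> bool:
    """Bellman-Ford in dimension d, started with distance 0 at every state."""
    dist = {s: 0 for s in states}
    changed = False
    for _ in range(len(states)):
        changed = False
        for u, v, w in edges:
            if dist[u] + w[d] < dist[v]:
                dist[v] = dist[u] + w[d]
                changed = True
        if not changed:
            break
    # A relaxation in the last of |states| passes witnesses a negative cycle.
    return changed


def witness_is_winning(start: str, edges: List[Edge], k: int) -> bool:
    """True iff no reachable cycle is negative in some dimension."""
    states = reachable(start, edges)
    sub = [(u, v, w) for (u, v, w) in edges if u in states]
    return not any(has_negative_cycle(states, sub, d) for d in range(k))
```

Each dimension is checked separately because player 2, who makes all remaining choices, only needs one cycle that is negative in one dimension to exhaust any finite initial credit.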

The NP-hardness follows from a result of [13] where, given a directed graph and four vertices
$w, x, y, z$, the problem of deciding the existence of two disjoint simple paths (one from $w$ to $x$ and
the other from $y$ to $z$) is shown to be NP-complete. Given such a graph and vertices, construct a
one-player game by (1) adding the edges $(x, y)$ with weight $(n, -1)$ and $(z, w)$ with weight $(-1, n)$
(where $n$ is the number of vertices in the graph), and (2) assigning all other edges of the graph
the weight $(-1, -1)$. In the resulting one-player game, a winning memoryless strategy from $w$ must
induce a simple cycle through $w, x, y, z$ to ensure a nonnegative sum of weights in the two dimensions.
This shows that the unknown initial credit problem for memoryless strategies is at least as hard as
the decision problem of [13], and thus NP-hard. The NP-completeness result follows. □

The reduction in the proof of Theorem 4 can be obtained with weights in $\{-1, 0, 1\}$ by replacing
the edges with weight $n$ by a sequence of $n$ edges with weight 1. The reduction remains polynomial.
Theorem 4 shows NP-hardness for dimension $k = 2$ and weights in $\{-1, 0, 1\}$. For $k = 1$, the
problem is solvable in polynomial time with weights in $\{-1, 0, 1\}$, and for arbitrary integer weights,
the problem is in UP $\cap$ coUP [3, 5].

Equivalence with multi-mean-payoff games under finite-memory strategies. We show
that multi-mean-payoff games where the players are restricted to play finite-memory strategies 
are log-space equivalent to multi-energy games. The result of Lemma 6 shows that the unknown 
initial credit problem (for multi-energy games) and the mean-payoff threshold problem (with finite- 
memory strategies) are equivalent. 

Note that if the players use finite-memory strategies, then the outcome $\pi$ is ultimately periodic
(a play $\pi = s_0 s_1 \ldots s_n \ldots$ is ultimately periodic if it can be decomposed as $\pi = \rho_1 \cdot \rho_2^{\omega}$ where $\rho_1$
and $\rho_2$ are two finite sequences of states) and therefore, the values of $\overline{\mathsf{MP}}(\pi)$ and $\underline{\mathsf{MP}}(\pi)$ coincide.
We denote by $\mathsf{MeanPayoff}_G$ the set of ultimately periodic plays satisfying the multi-mean-payoff
objective $\mathsf{MeanPayoffInf}_G$ (or equivalently, satisfying $\mathsf{MeanPayoffSup}_G$).

Lemma 6. For all multi-weighted two-player game structures, the answer to the unknown initial 
credit problem is Yes if and only if the answer to the mean-payoff threshold problem under finite- 
memory strategies is Yes. 

Proof. Let $G$ be a multi-weighted two-player game structure of dimension $k$. First, assume that there
exists a winning strategy $\lambda_1$ for player 1 in $G$ for the energy objective $\mathsf{PosEnergy}_G(v_0)$ (for some $v_0$).
Theorem 2 establishes that finite memory is sufficient to win multi-energy games, so we can assume
that $\lambda_1$ has finite memory. Consider the restriction of the graph $G_{\lambda_1}$ to the reachable vertices;
we show that the energy vector of every simple cycle is nonnegative. By contradiction, if there existed
a simple cycle with energy vector negative in one dimension, then the infinite path that reaches
this cycle and loops through it forever would violate the objective $\mathsf{PosEnergy}_G(v_0)$ regardless of the
vector $v_0$. This shows that every reachable cycle in $G_{\lambda_1}$ has nonnegative mean-payoff value in
all dimensions, hence $\lambda_1$ is winning for the multi-mean-payoff objective $\mathsf{MeanPayoff}_G$.

Second, assume that there exists a finite-memory strategy $\lambda_1$ for player 1 that is winning in $G$
for the multi-mean-payoff objective $\mathsf{MeanPayoff}_G$. By the same argument as above, all simple cycles
in $G_{\lambda_1}$ are nonnegative and the strategy $\lambda_1$ is also winning for the objective $\mathsf{PosEnergy}_G(v_0)$ for
some $v_0$. Taking $v_0 = \{nW\}^k$ where $n$ is the number of states in $G_{\lambda_1}$ (which bounds the length of
the acyclic paths) and $W \in \mathbb{Z}$ is the largest weight in the game suffices. □

Fig. 3. A multi-mean-payoff game where infinite memory is necessary to win (Lemma 7).

Note that the result of Lemma 6 does not hold for arbitrary strategies as shown in the following 
lemma. 

Lemma 7. In multi-mean-payoff games, in general infinite-memory strategies are required for win- 
ning (i.e., in general, finite-memory strategies are not sufficient for winning). 

Proof. The example of Fig. 3 shows a one-player game. We claim that (a) for $\underline{\mathsf{MP}}$, player 1 can
achieve a threshold vector $(1, 1)$, and (b) for $\overline{\mathsf{MP}}$, player 1 can achieve a threshold vector $(2, 2)$;
(c) if we restrict player 1 to use a finite-memory strategy, then it is not possible to win the
multi-mean-payoff objective with threshold $(1, 1)$ (and thus also not with $(2, 2)$). To prove (a), consider
the strategy that visits $n$ times $s_a$ and then $n$ times $s_b$, and repeats this forever with increasing
value of $n$. This guarantees a mean-payoff vector $(1, 1)$ for $\underline{\mathsf{MP}}$ because in the long run roughly half
of the time is spent in $s_a$ and roughly half of the time in $s_b$. To prove (b), consider the strategy
that alternates visits to $s_a$ and $s_b$ such that after the $n$th alternation, the self-loop on the visited
state $s$ ($s \in \{s_a, s_b\}$) is taken so many times that the average frequency of $s$ gets larger than $1 - \frac{1}{n}$ in
the current finite prefix of the play. This is always possible and achieves threshold $(2, 2)$ for $\overline{\mathsf{MP}}$.
Note that the above two strategies require infinite memory. To prove (c), recall that finite-memory
strategies produce an ultimately periodic play and therefore $\underline{\mathsf{MP}}$ and $\overline{\mathsf{MP}}$ coincide. It is easy to see
that such a play cannot achieve $(1, 1)$ because the periodic part would have to visit both $s_a$ and
$s_b$ and then the mean-payoff vector $(v_1, v_2)$ of the play would be such that $v_1 + v_2 < 2$ and thus
$v_1 = v_2 = 1$ is impossible. □
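
Claim (c) can be checked numerically on periodic plays. The sketch below rests on an assumption about Fig. 3, whose edge labels are not all recoverable here: the self-loop on $s_a$ is taken to have weight $(2, 0)$, the self-loop on $s_b$ weight $(0, 2)$, and the two edges between $s_a$ and $s_b$ weight $(0, 0)$. The function name `periodic_mean_payoff` is hypothetical.

```python
from fractions import Fraction
from typing import Tuple


def periodic_mean_payoff(p: int, q: int) -> Tuple[Fraction, Fraction]:
    """Mean-payoff vector of the play whose period is: p self-loops on s_a,
    the edge to s_b, q self-loops on s_b, the edge back to s_a.

    Assumed weights: (2, 0) on the s_a loop, (0, 2) on the s_b loop,
    (0, 0) on the two connecting edges.
    """
    length = p + q + 2                 # two connecting edges per period
    total = (2 * p, 2 * q)             # only the self-loops carry weight
    return (Fraction(total[0], length), Fraction(total[1], length))
```

Under these assumed weights, $v_1 + v_2 = \frac{2(p + q)}{p + q + 2} < 2$ for every period, so no periodic play reaches $(1, 1)$, in line with the argument for (c).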

Lemma 6 and Lemma 7 along with Theorem 2 give the following result. 

Theorem 5. Multi-mean-payoff games are determined under finite-memory, but not determined 
by finite-memory (i.e., winning strategies in general require infinite-memory, and determinacy and 
determinacy under finite-memory do not coincide). For multi-mean-payoff objectives $\varphi$ we have

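$\langle\!\langle 1 \rangle\!\rangle^{\mathit{finite}} \varphi = S \setminus \langle\!\langle 2 \rangle\!\rangle^{\mathit{finite}} \neg\varphi$.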

4 Multi-Mean-Payoff Games 

In this section we consider multi-mean-payoff games with infinite-memory strategies (we have
already shown in the previous section that multi-mean-payoff games with finite-memory strategies
coincide with multi-energy games). We present the following complexity results for the
mean-payoff threshold problem: (1) NP $\cap$ coNP for conjunction of MeanPayoffSup objectives;
(2) coNP-completeness for conjunction of MeanPayoffInf objectives; and (3) coNP-completeness for
conjunction of mean-payoff-inf and mean-payoff-sup objectives.

4.1 Conjunction of MeanPayoffSup objectives 

We consider multi-weighted two-player game structures with the multi-mean-payoff objective
$\mathsf{MeanPayoffSup}_G = \{\pi \in \mathsf{Plays}(G) \mid \overline{\mathsf{MP}}(\pi) \geq (0, 0, \ldots, 0)\}$ for player 1. In general, winning
strategies for player 1 require infinite memory. We show that memoryless winning strategies exist for
player 2 and we present a reduction of the decision problem for a conjunction of $k$ mean-payoff-sup
objectives to solving polynomially many instances of the decision problem for a single
mean-payoff-sup objective. As a consequence the decision problem for $\mathsf{MeanPayoffSup}_G$ lies in NP $\cap$ coNP, and
we obtain a pseudo-polynomial time algorithm for this problem.

In the next lemma we show that if player 1 can satisfy the MeanPayoffSup objective in
every individual dimension from all states, then player 1 can satisfy the conjunctive MeanPayoffSup
objective from all states. The converse holds trivially. The main idea of the proof is as follows:
for each $1 \leq i \leq k$, let $\lambda_1^i$ be a winning strategy for player 1 for the objective $\mathsf{MeanPayoffSup}_i$.
Intuitively, the winning strategy for the conjunction of mean-payoff-sup objectives plays $\lambda_1^i$ until
the mean-payoff value on dimension $i$ gets larger than a number very close to 0, and then switches
to the strategy $\lambda_1^{(i \bmod k) + 1}$, etc. This way player 1 ensures a nonnegative mean-payoff-sup value
in every dimension. We present the proof formally below. While memoryless winning strategies
exist for each individual dimension, we present a proof that does not rely on the existence of
memoryless winning strategies for individual dimensions. A similar proof technique is used later
where memoryless winning strategies for each individual dimension are not guaranteed to exist.

Lemma 8. If for all states $s \in S$ and for all $1 \leq i \leq k$, player 1 has a winning strategy from $s$
for the objective $\mathsf{MeanPayoffSup}_i = \{\pi \in \mathsf{Plays} \mid (\overline{\mathsf{MP}}(\pi))_i \geq 0\}$ (player 1 has winning strategies for
each individual dimension), then for all states $s \in S$, player 1 has a winning strategy from $s$ for
the objective $\mathsf{MeanPayoffSup} = \{\pi \in \mathsf{Plays} \mid \overline{\mathsf{MP}}(\pi) \geq (0, 0, \ldots, 0)\}$.

Proof. For each $s \in S$ and $1 \leq i \leq k$, let $\lambda_1^i(s)$ be a winning strategy for player 1 from $s$ for the
objective $\mathsf{MeanPayoffSup}_i$, and consider the strategy tree $T_{\lambda_1^i(s)}$. For $\alpha > 0$, we say that a node $v$ of
$T_{\lambda_1^i(s)}$ is an $\alpha$-good node if the average of the weights of dimension $i$ of the path from the root to $v$
is at least $-\alpha$. For $Z \in \mathbb{N}$, let $T^i_{\alpha,Z}(s)$ be the tree obtained from $T_{\lambda_1^i(s)}$ by removing all descendants
of the $\alpha$-good nodes that are at depth at least $Z$. Hence, all branches of $T^i_{\alpha,Z}(s)$ have length at least
$Z$, and the leaves are $\alpha$-good nodes.

We show that $T^i_{\alpha,Z}(s)$ is a finite tree. By König's Lemma [16], it suffices to show that every
path in the tree $T^i_{\alpha,Z}(s)$ is finite. Assume towards contradiction that there is an infinite path $\pi$
in $T^i_{\alpha,Z}(s)$. Then $\pi$ is a play consistent with $\lambda_1^i(s)$, and since $\pi$ does not contain any $\alpha$-good node
beyond depth $Z$, the mean-payoff-sup value of $\pi$ in dimension $i$ is at most $-\alpha$, i.e., $(\overline{\mathsf{MP}}(\pi))_i \leq -\alpha$.
This contradicts the assumption that $\lambda_1^i(s)$ is a winning strategy for player 1 in dimension $i$.

We now describe a strategy for player 1 based on the winning strategies of the individual
dimensions and show that the strategy is winning for the conjunction of mean-payoff-sup objectives.
Let $W \in \mathbb{N}$ be the largest absolute value of the weight function $w$.

1: $\alpha \leftarrow 1$
2: loop
3:   for $i = 1$ to $k$ do
4:     Let $s$ be the current state, and $L$ be the length of the play so far.
6:     Play according to $\lambda_1^i(s)$ until a leaf of $T^i_{\alpha,Z}(s)$ is reached.
7:   end for
8:   $\alpha \leftarrow \frac{\alpha}{2}$
9: end loop

After the last command in the internal for-loop was executed, the mean-payoff value in
dimension $i$ is at least $\frac{-L \cdot W - m \cdot \alpha}{L + m}$, where $m \geq Z$ is the number of steps played in that
iteration, and this is at least $-2\alpha$ whenever $Z \geq \frac{L \cdot W}{\alpha}$.

Since $T^i_{\alpha,Z}(s)$ is a finite tree, the main loop gets executed infinitely often (i.e., the strategy does
not get stuck in the for-loop) and $\alpha$ tends to 0. Thus the supremum of the mean-payoff value is at
least 0 in every dimension. Hence the strategy described above is a winning strategy for player 1
for MeanPayoffSup. □
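
The lower bound of $-2\alpha$ on the mean-payoff value after one iteration of the for-loop can be verified directly. The derivation below is a sketch under an assumption that is not recoverable from the listing of the strategy: the depth bound is taken to satisfy $Z \geq L \cdot W / \alpha$. Here $L$ is the length of the play before the iteration, $m \geq Z$ is the number of steps played in the iteration, the prefix contributes at least $-L \cdot W$, and the $m$ new steps contribute at least $-m \cdot \alpha$ because the leaf reached is an $\alpha$-good node.

```latex
\begin{align*}
\frac{-L \cdot W - m \cdot \alpha}{L + m}
  &= -\frac{L \cdot W}{L + m} - \frac{m \cdot \alpha}{L + m} \\
  &\geq -\frac{L \cdot W}{m} - \alpha
     && \text{since } L \geq 0 \text{ and } m > 0 \\
  &\geq -\alpha - \alpha = -2\alpha
     && \text{since } m \geq Z \geq \frac{L \cdot W}{\alpha}.
\end{align*}
```

Since $\alpha$ is halved after each pass over the $k$ dimensions, this bound tends to 0, which is what the limit-superior argument needs.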

In Lemma 8 the winning strategy constructed for player 1 requires infinite memory, and by
Lemma 7 infinite memory is required in general. For player 2, we show that memoryless winning
strategies exist, and we derive the algorithmic solution for the mean-payoff threshold problem.

Lemma 9. In multi-mean-payoff games with conjunction of MeanPayoffSup objectives for player 1,
memoryless strategies are sufficient for player 2.

Proof. The proof is by induction on the number of states $|S|$ in the game structure. The base case
with $|S| = 1$ is trivial. We now consider the inductive case with $|S| = n \geq 2$. Let $k \in \mathbb{N}$ be the
dimension of the weight function $w$. For $i = 1, \ldots, k$, let $W_i = \langle\!\langle 2 \rangle\!\rangle \neg\mathsf{MeanPayoffSup}_i$ be the winning
region for player 2 for the one-dimensional mean-payoff game played in dimension $i$ (i.e., in $W_i$
player 2 wins for the objective complementary to $\mathsf{MeanPayoffSup}_i = \{\pi \in \mathsf{Plays} \mid (\overline{\mathsf{MP}}(\pi))_i \geq 0\}$).
Let $W = \bigcup_{i=1}^{k} W_i$. We consider the following two cases:

1. If $W = \emptyset$, then player 1 can satisfy the mean-payoff-sup objective in every dimension, and then
   by Lemma 8 player 1 wins from everywhere for the objective $\mathsf{MeanPayoffSup} = \{\pi \in \mathsf{Plays} \mid
   \overline{\mathsf{MP}}(\pi) \geq (0, 0, \ldots, 0)\}$. Hence there is no winning strategy for player 2.

2. If $W \neq \emptyset$, then there exists $1 \leq i \leq k$ such that $W_i \neq \emptyset$. In $W_i$ there is a memoryless winning
   strategy $\lambda_2$ for player 2 to falsify $\mathsf{MeanPayoffSup}_i = \{\pi \in \mathsf{Plays} \mid (\overline{\mathsf{MP}}(\pi))_i \geq 0\}$ since memoryless
   winning strategies exist for both players in mean-payoff games with a single objective [11]. The
   strategy also falsifies $\mathsf{MeanPayoffSup} = \{\pi \in \mathsf{Plays} \mid \overline{\mathsf{MP}}(\pi) \geq (0, 0, \ldots, 0)\}$.

   Since $W_i$ is a winning region for player 2, it follows that $W_i = \mathsf{Attr}_2(W_i)$, and the graph $G'$
   induced by $S \setminus W_i$ is a game structure. Let $W' = W \setminus W_i$ be the winning region for player 2 in
   $G'$. By induction hypothesis ($G'$ has strictly fewer states as a non-empty set $W_i$ is removed),
   it follows that there is a memoryless winning strategy $\lambda_2'$ in $G'$ in the region $W'$. The winning
   region $S \setminus (W_i \cup W')$ for player 1 in $G'$ is also winning for player 1 in $G$ (since $W_i = \mathsf{Attr}_2(W_i)$,
   $G'$ is obtained by removing only player 1 edges). Hence to complete the proof it suffices to show
   that the memoryless strategy obtained by combining $\lambda_2$ in $W_i$ and $\lambda_2'$ in $W'$ is winning for
   player 2 from $W_i \cup W'$. Define the strategy $\lambda_2^*$ as follows:

   $$\lambda_2^*(s) = \begin{cases} \lambda_2(s) & \text{if } s \in W_i \\ \lambda_2'(s) & \text{if } s \in W'. \end{cases}$$

   Consider the memoryless strategy $\lambda_2^*$ for player 2 and the outcome of any counter strategy for
   player 1 that starts in $W' \cup W_i$. There are two cases: (a) if the play reaches $W_i$, then it reaches
   it in finitely many steps, and then $\lambda_2$ ensures that player 2 wins; and (b) if the play never reaches
   $W_i$, then the play always stays in $G'$, and now the strategy $\lambda_2'$ ensures winning for player 2.
   This completes the proof of the second item.

The desired result follows. □

Algorithm. We present Algorithm 1 to solve games with conjunction of mean-payoff-sup
objectives. The algorithm maintains the current game structure $G_{cur}$ induced by the current set of states
$S_{cur}$. In every iteration of the repeat-loop, for $i = 1, \ldots, k$, we compute the winning region $W_i$ for
player 2 in the current game structure with the single mean-payoff objective on dimension $i$ by a
call to SolveSingleMeanPayoffSup($G_{cur}$, $(w)_i$), which returns the winning region for player 1 in $G_{cur}$
for the objective $\mathsf{MeanPayoffSup}_i$; $W_i$ is the complement of the returned region in $S_{cur}$. If $W_i$ is
nonempty, then we remove $W_i$ from the current game structure and the iteration continues.



Algorithm 1: SolveMeanPayoffSupGame

Input : A game $G$ with state space $S$ and multi-weight function $w$.
Output : The winning region of player 1 for objective $\mathsf{MeanPayoffSup} = \bigcap_{1 \leq i \leq k} \mathsf{MeanPayoffSup}_i$.
   until LosingStatesFound = false
   return $S_{cur}$
end



In every iteration the set of states removed from the game structure is certainly winning for
player 2. In the end we obtain a game structure such that player 1 wins the mean-payoff objective
in every individual dimension from all states, and by Lemma 8 it follows that the remaining region
is winning for player 1. Thus game structures with conjunction of mean-payoff-sup objectives can
be solved by $O(k \cdot |S|)$ calls to solutions of mean-payoff games with a single objective. The following
theorem summarizes the results for multi-weighted games with conjunction of mean-payoff-sup
objectives.
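
The iteration described for Algorithm 1 can be sketched as follows. This is an illustration based on the prose description only, not a transcription of the algorithm listing. The single-objective solver is passed in as a parameter, and `toy_solver` is a hypothetical stand-in that is only valid for one-player graphs whose only cycles are self-loops; a real use would plug in a solver for two-player mean-payoff games with a single objective.

```python
from typing import Callable, List, Set, Tuple

Edge = Tuple[str, str, Tuple[int, ...]]            # (source, target, multi-weight)
Solver = Callable[[Set[str], List[Edge], int], Set[str]]


def solve_mean_payoff_sup_game(states: Set[str], edges: List[Edge], k: int,
                               solve_single: Solver) -> Set[str]:
    """Remove player-2 winning regions dimension by dimension until none is left."""
    cur = set(states)
    losing_states_found = True
    while losing_states_found:                     # repeat ... until nothing is removed
        losing_states_found = False
        for i in range(k):
            sub = [(u, v, w) for (u, v, w) in edges if u in cur and v in cur]
            # W_i: complement of player 1's winning region in dimension i.
            losing = cur - solve_single(cur, sub, i)
            if losing:
                cur -= losing
                losing_states_found = True
    return cur


def toy_solver(states: Set[str], edges: List[Edge], i: int) -> Set[str]:
    """Hypothetical solver for one-player graphs whose only cycles are self-loops:
    player 1 wins dimension i from s iff s can reach a self-loop whose weight
    in dimension i is nonnegative."""
    good = {u for (u, v, w) in edges if u == v and w[i] >= 0}
    changed = True
    while changed:
        changed = False
        for (u, v, _) in edges:
            if v in good and u not in good:
                good.add(u)
                changed = True
    return good & set(states)
```

Each pass of the outer loop either removes at least one state or terminates, and each pass makes $k$ solver calls, which matches the $O(k \cdot |S|)$ bound on the number of calls.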

Theorem 6. For multi-weighted two-player game structures with objective $\mathsf{MeanPayoffSup} = \{\pi \in
\mathsf{Plays} \mid \overline{\mathsf{MP}}(\pi) \geq (0, 0, \ldots, 0)\}$ for player 1, the following assertions hold:

1. Winning strategies for player 1 require infinite memory in general, and memoryless winning
strategies exist for player 2.

2. The problem of deciding whether a given state is winning for player 1 lies in NP $\cap$ coNP.

3. The set of winning states for player 1 can be computed with $k \cdot |S|$ calls to a procedure for solving
game structures with a single mean-payoff objective, hence in pseudo-polynomial time
$O(k \cdot |S|^2 \cdot |E| \cdot W)$.

The results of Theorem 6 are proved as follows. Item 1 follows from Lemma 7 and Lemma 9.
Item 3 follows from Algorithm 1 and the results of [5] where an algorithm is given for games with
single mean-payoff objectives that works in time $O(|S| \cdot |E| \cdot W)$. We now present the details of
Item 2 in two parts. (1) (In NP). The NP algorithm guesses the winning region $W$ for player 1,
and a memoryless winning strategy for every individual dimension $i$ (such memoryless winning
strategies for every individual dimension exist by the results of [11]). The verification procedure
checks in polynomial time that for every dimension $i$ the set $W$ is the winning set for player 1 in
the graph $G_{\lambda_1^i}$ using the polynomial time algorithm of [15]. The correctness (that is, the existence
of a winning strategy in every individual dimension implies winning for the conjunction) follows
from Lemma 8. (2) (In coNP). The coNP algorithm guesses a memoryless winning strategy $\lambda_2$ for
player 2. The verification procedure needs to solve mean-payoff-sup objectives for the graph $G_{\lambda_2}$
and by Algorithm 1 this can be solved with $k \cdot |S|$ calls to the polynomial time algorithm of [15]
to solve graphs with single mean-payoff objectives. Thus we have a polynomial-time verification
procedure, and the coNP complexity bound follows.

4.2 Conjunction of MeanPayoffInf objectives

We consider multi-weighted two-player game structures, and the multi-mean-payoff-inf objective
$\mathsf{MeanPayoffInf} = \{\pi \in \mathsf{Plays}(G) \mid \underline{\mathsf{MP}}(\pi) \geq (0, 0, \ldots, 0)\}$ for player 1. In general winning strategies
for player 1 require infinite memory (Lemma 7). We show that memoryless winning strategies exist
for player 2, and the threshold problem is coNP-complete.

Memoryless strategies for player 2. The objective for player 2 is the complementary objective 
of player 1. It follows from the results of [17] that memoryless winning strategies exist for player 2 
(see Appendix for discussion). 

Complexity. We show that the problem of deciding whether a given state is winning for player 1
in multi-weighted game structures with conjunction of mean-payoff-inf objectives is coNP-complete.
We first argue about the coNP lower bound.

coNP lower bound. The proof is essentially the same as the proof of Lemma 5 and relies on the
existence of memoryless winning strategies for player 2. We consider the hardness proof of Lemma 5
and the reduction used in the lemma. If the formula is satisfiable, then consider the memoryless
winning strategy for player 2 constructed from the satisfying assignment. Consider an arbitrary
strategy (possibly with infinite memory) for player 1. Since the strategy for player 2 is constructed
from a non-conflicting assignment, it follows that conflicting literals do not appear. Within every
three steps some literal is visited. If $n$ is the number of variables, then in any play prefix compatible
with the strategy of player 2, the frequency of the literal $x$ with highest frequency in this prefix
is at least $\frac{1}{3 \cdot (n+1)}$ (and note that the literal $\overline{x}$ has never appeared). It follows that the average of
the weights in the dimension for $\overline{x}$ is at most $-\frac{1}{3 \cdot (n+1)}$ and therefore the mean-payoff-inf objective
is violated in some dimension. Conversely, if the formula is not satisfiable, then against every
memoryless strategy for player 2, the counter strategy constructed in Lemma 5 (that alternates
between the conflicting assignments) ensures that the mean-payoff-inf objective is satisfied. Hence
the coNP-hardness follows.

coNP upper bound. The rest of the section is devoted to proving the coNP upper bound. Once 
a memoryless strategy for player 2 is fixed (as a polynomial witness), we obtain a one-player game 
structure. To establish the coNP upper bound we need to show that the problem can be solved in 
polynomial time for one-player game structures. A polynomial-time algorithm for the problem is 
obtained by solving a variant of the zero circuit problem for multi-weighted directed graphs. The 
variant of the zero circuit problem is the nonnegative multi-cycle problem for directed graphs, where 
the multi-cycle is not required to be connected by edges as in the case of zero circuit problem. 

Nonnegative multi-cycles. Let $G = (V, E, w : E \to \mathbb{Z}^k)$ be a multi-weighted directed graph that
is strongly connected. A multi-cycle is a multi-set of simple cycles. For a multi-cycle $\mathcal{C}$ we denote
by $\mathsf{SetCycle}(\mathcal{C})$ the set of cycles that appear in $\mathcal{C}$, and hence $\mathsf{SetCycle}(\mathcal{C})$ is a set of simple cycles.
For a multi-cycle $\mathcal{C} = \{C_1, \ldots, C_n\}$ we denote by $m_i$ the number of occurrences of the simple cycle
$C_i$ in the multi-set $\mathcal{C}$, and refer to $m_i$ as the factor of $C_i$. For a simple cycle $C = (e_0, e_1, \ldots, e_n)$,
we denote $w(C) = \sum_{e \in C} w(e)$. For a multi-cycle $\mathcal{C}$, we denote $w(\mathcal{C}) = \sum_{C \in \mathcal{C}} w(C)$ (note that in
the summation a cycle $C$ may appear multiple times in $\mathcal{C}$, and alternatively the summation can
be expressed as considering the simple cycles $C_i$ that appear in $\mathcal{C}$ and summing up $m_i \cdot w(C_i)$). A
nonnegative multi-cycle is a non-empty multi-set of simple cycles $\mathcal{C}$ such that $w(\mathcal{C}) \geq 0$ (i.e., in
every dimension the weight is nonnegative).
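
The definitions above translate directly into code. The following is a small sketch in which a multi-cycle is represented as a `collections.Counter` mapping each simple cycle to its factor; the representation of a cycle as a tuple of edges and the function names are assumptions made for this example.

```python
from collections import Counter
from typing import Dict, Tuple

EdgeId = Tuple[str, str]
Cycle = Tuple[EdgeId, ...]                      # a simple cycle as a tuple of edges


def cycle_weight(cycle: Cycle, w: Dict[EdgeId, Tuple[int, ...]], k: int) -> Tuple[int, ...]:
    """w(C): componentwise sum of the edge weights of a simple cycle."""
    return tuple(sum(w[e][d] for e in cycle) for d in range(k))


def multi_cycle_weight(multi_cycle: Counter, w: Dict[EdgeId, Tuple[int, ...]],
                       k: int) -> Tuple[int, ...]:
    """w of a multi-cycle: sum over simple cycles C_i of factor m_i times w(C_i)."""
    total = [0] * k
    for cycle, factor in multi_cycle.items():
        cw = cycle_weight(cycle, w, k)
        for d in range(k):
            total[d] += factor * cw[d]
    return tuple(total)


def is_nonnegative_multi_cycle(multi_cycle: Counter, w: Dict[EdgeId, Tuple[int, ...]],
                               k: int) -> bool:
    """Non-empty multi-set whose total weight is nonnegative in every dimension."""
    non_empty = sum(multi_cycle.values()) > 0
    return non_empty and all(x >= 0 for x in multi_cycle_weight(multi_cycle, w, k))
```

Note that the cycles of a multi-cycle need not share a vertex: two self-loops on different states can compensate each other's negative dimensions even though no single circuit contains both.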

Lemma 10. Let $G = (V, E, w : E \to \mathbb{Z}^k)$ be a multi-weighted directed graph that is strongly
connected.

1. The problem of deciding if $G$ has a nonnegative multi-cycle can be solved in polynomial time.

2. If $G$ does not have a nonnegative multi-cycle, then there exist a constant $m_G \in \mathbb{N}$ and a
real-valued constant $c_G > 0$ such that for all finite paths $\pi^f$ in the graph $G$ we have
$\min\{w_i(\pi^f) \mid i \in \{1, \ldots, k\}\} \leq m_G - c_G \cdot |\pi^f|$.

Proof. We prove both items below.

1. The proof of the first item is almost exactly the proof of Theorem 2.2 in [18]. Given the
directed strongly connected graph $G = (V, E, w : E \to \mathbb{Z}^k)$, we consider a variable $x_e$ (the edge
coefficient of $e$) for every $e \in E$. We define the following set of linear constraints.

(a) For $v \in V$, let $\mathsf{IN}(v)$ be the set of all in-edges of $v$, and $\mathsf{OUT}(v)$ be the set of out-edges of
$v$. For every $v \in V$ we define the linear constraint $\sum_{e \in \mathsf{IN}(v)} x_e = \sum_{e \in \mathsf{OUT}(v)} x_e$.

(b) For every $e \in E$ we define the constraint $x_e \geq 0$.

(c) For every dimension $i \in \{1, \ldots, k\}$, we define the constraint $\sum_{e \in E} x_e \cdot w_i(e) \geq 0$.

(d) Finally, we define the constraint $\sum_{e \in E} x_e \geq 1$.

The first set of linear constraints is intuitively the flow constraints; the second constraint specifies
that for every edge $e$, the edge coefficient $x_e$ is nonnegative; the third constraint specifies that
in every dimension the sum of the edge coefficients times the weights is nonnegative; and the last
constraint ensures that at least one edge coefficient is strictly positive (to ensure that the
multi-cycle is non-empty). This set of constraints can be solved in polynomial time using standard
linear programming algorithms. It essentially follows from [18] that this set of linear constraints
has a solution iff a nonnegative multi-cycle exists.

2. Let $\pi^f$ be a finite path in $G$. The finite path can be decomposed into three paths $\pi^f_0, \pi^f_c, \pi^f_1$,
where $\pi^f_0$ is an initial prefix of length at most $|V|$, $\pi^f_c$ consists of cycles (not necessarily simple),
and $\pi^f_1$ is a segment of length at most $|V|$ at the end. We can uniquely decompose $\pi^f_c$ into a set
$\mathcal{C}$ of multi-cycles and hence also into a set of simple cycles $C = \mathsf{SetCycle}(\mathcal{C}) = \{C_1, \ldots, C_n\}$,
for $n \leq 2^{|E|}$, such that cycle $C_i$ occurs $r_i$ times in $\pi^f_c$, for some $r_i \in \mathbb{N}$. The sum of the weights
in the part $\pi^f_c$ is

$$w(\pi^f_c) = \sum_{i=1}^{n} r_i \cdot w(C_i) = \Big(\sum_{i=1}^{n} r_i\Big) \cdot \sum_{i=1}^{n} \frac{r_i}{\sum_{j=1}^{n} r_j} \cdot w(C_i) \leq |\pi^f_c| \cdot \sum_{i=1}^{n} \frac{r_i}{\sum_{j=1}^{n} r_j} \cdot w(C_i).$$

The second equality is obtained by multiplying and dividing by $\sum_{i=1}^{n} r_i$, and the inequality
is obtained since $\sum_{i=1}^{n} r_i \leq |\pi^f_c|$ (as $|\pi^f_c| = \sum_{i=1}^{n} r_i \cdot |C_i|$). Let $\beta_i = \frac{r_i}{\sum_{j=1}^{n} r_j}$ and observe that
$\beta_1, \ldots, \beta_n \geq 0$ with $\sum_{j=1}^{n} \beta_j = 1$. We first show the existence of a constant $\eta_C > 0$, such that
for every $\alpha_1, \ldots, \alpha_n \geq 0$ with $\sum_{i=1}^{n} \alpha_i = 1$, there exists a dimension $i \in \{1, \ldots, k\}$ such that

$$\sum_{j=1}^{n} \alpha_j \cdot w_i(C_j) \leq -\eta_C.$$

For every $i \in \{1, \ldots, k\}$, we define a function $f_i(\alpha_1, \ldots, \alpha_n) = \sum_{j=1}^{n} \alpha_j \cdot w_i(C_j)$ and
$f(\alpha_1, \ldots, \alpha_n) = \min\{f_i(\alpha_1, \ldots, \alpha_n) \mid 1 \leq i \leq k\}$. For every $i \in \{1, \ldots, k\}$, the function $f_i$
is continuous. Since $f$ is the minimum of a finite number of continuous functions, $f$ is also
continuous. Observe that $[0, 1]^n \cap \{(\alpha_1, \ldots, \alpha_n) \mid \sum_{j=1}^{n} \alpha_j = 1\}$ is a closed and bounded set. Hence
by the Weierstrass theorem the function $f$ has a maximum $c_f$ in this domain. Let $\alpha^*_1, \ldots, \alpha^*_n \geq 0$ be such
that $f(\alpha^*_1, \ldots, \alpha^*_n) = c_f$ and $\sum_{j=1}^{n} \alpha^*_j = 1$. Assume towards contradiction that $c_f \geq 0$; we then
show that the linear programming problem on the constraints mentioned above (in item 1) has a
solution, which leads to a contradiction. For an edge $e$, we define the edge coefficient as follows:
$x_e = \sum_{C_j : e \in C_j} \alpha^*_j$ (i.e., the sum of the $\alpha^*_j$'s of the cycles the edge belongs to). It follows that all
the constraints are satisfied, and this contradicts the assumption that there is no nonnegative
multi-cycle. Hence we have $c_f < 0$. Hence it follows that there exists a dimension $i$ such that

$$w_i(\pi^f) \leq (|\pi^f_0| + |\pi^f_1|) \cdot W + c_f \cdot |\pi^f_c| = (|\pi^f_0| + |\pi^f_1|) \cdot W + (|\pi^f_0| + |\pi^f_1|) \cdot (-c_f) + c_f \cdot |\pi^f| \leq 2 \cdot |V| \cdot (W - c_f) + c_f \cdot |\pi^f|.$$

Let $m_C = \lceil 2 \cdot |V| \cdot (W - c_f) \rceil$ and $\eta_C = -c_f$, and we obtain the desired result for the path $\pi^f$. Let
$\mathfrak{C} = \{\mathsf{SetCycle}(\mathcal{C}) \mid \mathcal{C} \text{ is a multi-cycle}\}$ be the set of the sets of simple cycles of all the multi-cycles of $G$.
Note that $\mathfrak{C}$ is a set whose elements are subsets of simple cycles, i.e., $\mathfrak{C}$ is contained in the power set of
the set of simple cycles and hence $|\mathfrak{C}| \leq 2^{2^{|E|}}$. By choosing $m_G = \max_{C \in \mathfrak{C}} m_C$ and $c_G = \min_{C \in \mathfrak{C}} \eta_C$
we obtain the desired result.

□
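
To make the linear constraints of item 1 of the proof concrete, the sketch below checks whether a candidate vector of edge coefficients satisfies the flow constraints, nonnegativity, the per-dimension weight constraints, and the non-emptiness constraint (taken here as "sum at least 1"). It only verifies a given candidate; finding one requires a linear programming solver, which is not shown. Edges are identified by their endpoint pair, so parallel edges are not supported, and the function name is hypothetical.

```python
from fractions import Fraction
from typing import Dict, Tuple

EdgeId = Tuple[str, str]


def satisfies_constraints(x: Dict[EdgeId, Fraction],
                          w: Dict[EdgeId, Tuple[int, ...]], k: int) -> bool:
    """Check the flow, nonnegativity, weight and non-emptiness constraints."""
    vertices = {u for (u, _) in w} | {v for (_, v) in w}
    # Flow conservation: incoming coefficient mass equals outgoing mass.
    for v in vertices:
        inflow = sum(x[e] for e in w if e[1] == v)
        outflow = sum(x[e] for e in w if e[0] == v)
        if inflow != outflow:
            return False
    # Every edge coefficient is nonnegative.
    if any(x[e] < 0 for e in w):
        return False
    # In every dimension the weighted sum of coefficients is nonnegative.
    for i in range(k):
        if sum(x[e] * w[e][i] for e in w) < 0:
            return False
    # At least one coefficient is strictly positive (sum at least 1).
    return sum(x[e] for e in w) >= 1
```

A feasible vector with rational entries can be scaled to integers and then decomposed into simple cycles with integer factors, which is the intuition behind the correspondence with nonnegative multi-cycles.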

In the sequel we abbreviate a maximal strongly connected component of a graph as an scc.

Lemma 11. Let $G$ be a multi-weighted one-player game structure, and let $s_0$ be the initial state.
If there is an scc $C$ reachable from $s_0$ such that the multi-weighted directed graph induced by $C$
has a nonnegative multi-cycle, then player 1 has a strategy to satisfy the mean-payoff-inf objective
MeanPayoffInf.

Proof. Let $C$ be an scc reachable from $s_0$ such that the graph induced by $C$ has a nonnegative
multi-cycle. Then there exist simple cycles $C_1, \ldots, C_n$, factors $m_1, \ldots, m_n$ and finite paths
$\pi_{1,2}, \pi_{2,3}, \ldots, \pi_{n-1,n}, \pi_{n,1}$ such that (i) the path $\pi_{i,j}$ is an acyclic path from $C_i$ to $C_j$, and (ii) for
every $i = 1, \ldots, k$, we have $\sum_{j=1}^{n} m_j \cdot w_i(C_j) \geq 0$. An infinite-memory strategy for player 1 is as
follows: initialize $Z = 1$, and follow the steps below:

 1: loop
 2:   $Z \cdot m_1$ times in cycle $C_1$
 3:   $\pi_{1,2}$
 4:   $Z \cdot m_2$ times in cycle $C_2$
 5:   $\pi_{2,3}$
 6:   ...
 7:   $Z \cdot m_n$ times in cycle $C_n$
 8:   $\pi_{n,1}$
 9:   $Z \leftarrow Z + 1$
10: end loop

Let $L = |\pi_{1,2}| + |\pi_{2,3}| + \dots + |\pi_{n-1,n}| + |\pi_{n,1}|$ be the sum of the lengths of the paths between cycles,
and let $P = |C_1| + |C_2| + \dots + |C_n|$ be the sum of the lengths of the cycles. Note that both $L$ and
$P$ are bounded by $2^{|E|} \cdot |S|$ as $n \le 2^{|E|}$ and each path and cycle is of length at most $|S|$. Consider
the steps executed in round $Z + 1$: the sum of weights due to executing the cycles in all previous
rounds up to $Z$ is nonnegative in all dimensions. Hence the sum of weights in any dimension, in
the steps executed in round $Z + 1$ is at least

$$-(|S| + (Z + 1) \cdot P + Z \cdot L + L) \cdot W.$$

The negative contribution can come from executing the initial prefix of length at most $|S|$ to reach
the scc, then the cycles in the present round (bounded by $(Z + 1) \cdot P$ steps) and the paths $\pi_{i,j}$ of
length at most $L$ in the previous $Z$ rounds and in the current round (in total bounded by $Z \cdot L + L$
steps). The number of steps executed so far is at least $(L + P) \cdot \sum_{i=1}^{Z} i = (L + P) \cdot \frac{Z \cdot (Z + 1)}{2} \ge \frac{(L + P) \cdot Z^2}{2}$.
Hence the average for all dimensions for all steps in round $Z + 1$ is at least

$$\frac{-2 \cdot (|S| + (Z + 1) \cdot P + Z \cdot L + L) \cdot W}{(L + P) \cdot Z^2} = \frac{-2 \cdot (|S| + (Z + 1) \cdot (P + L)) \cdot W}{(L + P) \cdot Z^2} \ge \frac{-2 \cdot |S| \cdot W}{Z^2} + \frac{-2 \cdot (Z + 1) \cdot W}{Z^2}.$$

As $Z \to \infty$, it follows that the mean-payoff-inf value is at least $0$ in every dimension, and hence
the result follows. □
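
The round-based strategy of the proof can be simulated on a small instance. The sketch below is only an illustration: the two-cycle example (self-loops with weights (1, -1) and (-1, 1), joined by single edges of weight (-1, -1)), the function names, and the encoding of cycles and paths as lists of weight vectors are assumptions made here and are not taken from the paper. The initial prefix that reaches the scc is omitted.

```python
from itertools import count, islice

def round_strategy_weights(cycles, factors, paths):
    """Yield the weight vectors of the play: in round Z, go Z*m_j times
    around cycle C_j, then follow the connecting path to the next cycle."""
    for z in count(1):                       # Z = 1, 2, 3, ...
        for cycle, m, path in zip(cycles, factors, paths):
            for _ in range(z * m):
                yield from cycle
            yield from path

def worst_average(weights, start, stop):
    """Smallest running average over all dimensions and all prefix
    lengths n with start <= n <= stop."""
    sums, worst = None, float("inf")
    for n, w in enumerate(islice(weights, stop), start=1):
        sums = list(w) if sums is None else [a + b for a, b in zip(sums, w)]
        if n >= start:
            worst = min(worst, min(sums) / n)
    return worst

# Hypothetical instance: C_1 and C_2 are self-loops whose weights cancel
# (m_1 = m_2 = 1); each connecting path is one edge of weight (-1, -1).
CYCLES = [[(1, -1)], [(-1, 1)]]
FACTORS = [1, 1]
PATHS = [[(-1, -1)], [(-1, -1)]]

early = worst_average(round_strategy_weights(CYCLES, FACTORS, PATHS), 100, 200_000)
late = worst_average(round_strategy_weights(CYCLES, FACTORS, PATHS), 10_000, 200_000)
print(early, late)
```

In this instance the worst running average over the later window is closer to 0 than over the earlier one, which is the behaviour the bound above predicts as $Z$ grows. A finite simulation can only suggest the limit; it does not replace the proof.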



Lemma 12. Let $G$ be a multi-weighted one-player game structure, and let $s_0$ be the initial state.
If for every scc $C$ reachable from $s_0$ the multi-weighted directed graph induced by $C$ does not have a
nonnegative multi-cycle, then player 1 does not have a strategy from $s_0$ to satisfy the mean-payoff-inf
objective MeanPayoffInf.

Proof. Consider an arbitrary strategy for player 1, and let the set of states visited infinitely often be
contained in an scc $C$. Since $C$ does not have a nonnegative multi-cycle it follows from Lemma 10(2)
that every infinite path that visits states in $C$ has a mean-payoff-inf value at most $-c$, for some
$c > 0$, in some dimension. It follows that the strategy for player 1 does not satisfy the mean-payoff-inf
objective MeanPayoffInf. □

The following lemma shows that in one-player game structures the MeanPayoffInf objective can
be solved in polynomial time. To describe the precise complexity, let us denote by $\mathsf{LP}(i, j)$ the
complexity to solve linear inequalities with $i$ variables and $j$ constraints.

Lemma 13. Given a multi-weighted one-player game structure $G$ and a state $s_0$, the problem of
deciding whether player 1 has a strategy for a mean-payoff-inf objective MeanPayoffInf from $s_0$ can
be solved in polynomial time (in time $O(|S| + |E|) + \mathsf{LP}(|E|, |S| + |E| + k + 1)$).

Proof. It follows from Lemma 11 and Lemma 12 that an algorithm to solve the problem is as follows:
consider the scc decomposition of the graph, and for every multi-weighted graph induced by an scc
$C$ reachable from $s_0$ check if the multi-weighted directed graph induced by $C$ has a nonnegative
multi-cycle (in polynomial time by Lemma 10(1)). Since scc decomposition is linear time (in time
$O(|S| + |E|)$) and the number of scc's is linear, we obtain the desired result. The complexity of
solving the linear inequalities follows from Lemma 10. □
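
The outer structure of this algorithm can be sketched as follows. This is a sketch under stated assumptions, not the paper's implementation: the edge encoding `(u, v, weight_tuple)` and all function names are hypothetical, and the linear-programming check of Lemma 10(1) is not implemented. It is passed in as a callback `has_nonnegative_multicycle`. The stand-in `simple_cycle_check` is correct only in the special case where an scc consists of a single simple cycle.

```python
from collections import defaultdict

def reachable_sccs(edges, s0):
    """SCCs (as sets of states) of the part of the graph reachable from s0,
    computed with an iterative two-pass (Kosaraju-style) traversal."""
    succ, pred = defaultdict(list), defaultdict(list)
    for u, v, _w in edges:
        succ[u].append(v)
        pred[v].append(u)
    # Pass 1: DFS from s0, recording states in post-order.
    order, seen = [], {s0}
    stack = [(s0, iter(succ[s0]))]
    while stack:
        u, it = stack[-1]
        for v in it:
            if v not in seen:
                seen.add(v)
                stack.append((v, iter(succ[v])))
                break
        else:
            order.append(u)
            stack.pop()
    # Pass 2: backward search in decreasing post-order, reachable states only.
    sccs, assigned = [], set()
    for root in reversed(order):
        if root in assigned:
            continue
        comp, todo = set(), [root]
        assigned.add(root)
        while todo:
            u = todo.pop()
            comp.add(u)
            for p in pred[u]:
                if p in seen and p not in assigned:
                    assigned.add(p)
                    todo.append(p)
        sccs.append(comp)
    return sccs

def player1_wins(edges, s0, has_nonnegative_multicycle):
    """True iff some scc reachable from s0 passes the multi-cycle check
    (the check itself is an assumed callback standing in for Lemma 10(1))."""
    for comp in reachable_sccs(edges, s0):
        inner = [(u, v, w) for u, v, w in edges if u in comp and v in comp]
        if inner and has_nonnegative_multicycle(comp, inner):
            return True
    return False

def simple_cycle_check(comp, inner):
    """Stand-in check, valid only when the scc is one simple cycle: the
    cycle's total weight must be nonnegative in every dimension."""
    assert len(inner) == len(comp), "scc is not a single simple cycle"
    k = len(inner[0][2])
    return all(sum(w[i] for _, _, w in inner) >= 0 for i in range(k))

# Hypothetical example: scc {1} has a bad self-loop, scc {2, 3} a good cycle.
EDGES = [(0, 1, (0, 0)), (1, 1, (-1, 1)), (1, 2, (0, 0)),
         (2, 3, (1, -1)), (3, 2, (0, 1))]
print(player1_wins(EDGES, 0, simple_cycle_check))
```

Keeping the multi-cycle test as a parameter separates the linear-time graph traversal from the linear-inequality step, which mirrors the two terms of the running time stated in Lemma 13.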

Thus we obtain the desired coNP upper bound. We have the following theorem summarizing 
the result of this section. 

Theorem 7. For multi-weighted two-player game structures with objective $\mathsf{MeanPayoffInf} = \{\pi \in
\mathsf{Plays} \mid \underline{\mathsf{MP}}(\pi) \ge (0, 0, \dots, 0)\}$ for player 1, the following assertions hold:

1. Winning strategies for player 1 require infinite memory in general, and memoryless winning
strategies exist for player 2.

2. The problem of deciding whether a given state is winning for player 1 is coNP-complete.

4.3 Conjunction of MeanPayoffInf and MeanPayoffSup objectives

We consider multi-weighted two-player game structures, two sets $I, J \subseteq \{1, \dots, k\}$, and the
multi-mean-payoff objective $\mathsf{MeanPayoffInfSup}(I, J) = \{\pi \in \mathsf{Plays}(G) \mid \forall i \in I : \underline{\mathsf{MP}}(\pi)_i \ge 0 \text{ and } \forall j \in J :
\overline{\mathsf{MP}}(\pi)_j \ge 0\}$ for player 1.

Note that the problem is more general than the problem considered in the previous section (with
$J = \emptyset$ we obtain MeanPayoffInf objectives, and with $I = \emptyset$ we obtain MeanPayoffSup objectives).
Hence it follows that in general winning strategies for player 1 require infinite memory, and the
problem is coNP-hard. We show that memoryless winning strategies exist for player 2, and that
the decision problem is coNP-complete.

We start with the crucial result that considers the case when the mean-payoff-sup objective
is required for one dimension, and for all the other dimensions the mean-payoff-inf objective is
required. The lemma shows that if only one dimension has a MeanPayoffSup objective, then it can be
equivalently considered as a MeanPayoffInf objective.

Lemma 14. Let $I = \{1, \dots, k - 1\}$ and $s$ be a state. Player 1 has a winning strategy for the
objective $\mathsf{MeanPayoffInfSup}(I, \{k\})$ from $s$ if and only if player 1 has a winning strategy for the
objective $\mathsf{MeanPayoffInf} = \mathsf{MeanPayoffInfSup}(I \cup \{k\}, \emptyset)$ from $s$.

Proof. To prove the lemma we show the following equivalent statement: Player 2 has a winning
strategy to falsify $\mathsf{MeanPayoffInfSup}(I, \{k\})$ from $s$ if and only if player 2 has a winning strategy to
falsify $\mathsf{MeanPayoffInf} = \mathsf{MeanPayoffInfSup}(I \cup \{k\}, \emptyset)$ from $s$.

One direction is trivial as for any sequence $(u_i)_{i \ge 0}$ of real numbers we have $\limsup_{i \to \infty} u_i \ge
\liminf_{i \to \infty} u_i$, and hence it follows that a winning strategy for player 2 to falsify
$\mathsf{MeanPayoffInfSup}(I, \{k\})$ is also a winning strategy to falsify MeanPayoffInf.

Suppose that player 2 has a winning strategy for MeanPayoffInf; then by Theorem 7 player 2
has a memoryless winning strategy $\lambda_2$. Let $G_{\lambda_2}$ be the one-player game structure obtained by fixing
the strategy $\lambda_2$ for player 2. Since $\lambda_2$ is winning for player 2, it follows from Lemma 11 that in $G_{\lambda_2}$,
for all scc's $C$, in the subgraph induced by $C$ there is no nonnegative multi-cycle. It follows from
Lemma 10 that there exist a constant $m_{G_{\lambda_2}} \in \mathbb{N}$ and a real-valued constant $c_{G_{\lambda_2}} > 0$ such that for
all finite paths $\pi^f$ in the graph $G_{\lambda_2}$ we have $\min\{w_i(\pi^f) \mid i \in \{1, \dots, k\}\} \le m_{G_{\lambda_2}} - c_{G_{\lambda_2}} \cdot |\pi^f|$. Let
us denote $c = c_{G_{\lambda_2}}$. We show that $\lambda_2$ is winning for player 2 (to falsify $\mathsf{MeanPayoffInfSup}(I, \{k\})$).
Consider a play $\pi$ consistent with $\lambda_2$, and assume that $\overline{\mathsf{MP}}(\pi)_k \ge 0$. Then the average payoff in
dimension $k$ is greater than $-\frac{c}{2}$ in infinitely many positions (since the limit-superior is at least $0$),
and by Lemma 10 there is a dimension $1 \le i < k$ with average payoff at most $-c$ in infinitely many
positions, thus $\underline{\mathsf{MP}}(\pi)_i < 0$. Hence either the supremum of the average weight in dimension $k$ is
negative, or the infimum of the average weight in one of the other dimensions is negative. In either
case, the strategy $\lambda_2$ is winning for player 2. This completes the proof. □

Our goal is now to prove a result similar to Lemma 8 for $\mathsf{MeanPayoffInfSup}(I, J)$ objectives. To
prove the result, we first prove two lemmas. The following lemma about MeanPayoffInf objectives
is derived from the proof of Lemma 11 and it shows that if player 1 has a winning strategy for a
mean-payoff-inf objective (with threshold $0$ in every dimension), then for every $\alpha > 0$ there is a
finite-memory strategy to ensure a mean-payoff-inf value of at least $-\alpha$ in every dimension. Lemma 16
will be a consequence of Lemma 15.

Lemma 15. Let $G$ be a multi-weighted two-player game structure, and let $s_0$ be the initial state.
If there is a winning strategy for player 1 for the objective $\mathsf{MeanPayoffInf} = \{\pi \in \mathsf{Plays}(G) \mid \forall 1 \le
i \le k.\ (\underline{\mathsf{MP}}(\pi))_i \ge 0\}$, then for all $\alpha > 0$ there is a finite-memory winning strategy for player 1 to
ensure the objective $\mathsf{MeanPayoffInf}(-\alpha) = \{\pi \in \mathsf{Plays}(G) \mid \forall 1 \le i \le k.\ (\underline{\mathsf{MP}}(\pi))_i \ge -\alpha\}$.

Proof. Since against finite-memory strategies for player 1 memoryless winning strategies exist for
player 2 (Lemma 6 and Lemma 2) and multi-mean-payoff games are determined under finite memory
(Theorem 5), to prove that finite-memory winning strategies exist for player 1 for the objective
$\mathsf{MeanPayoffInf}(-\alpha)$ it suffices to show that against every memoryless strategy for player 2 there exists a
finite-memory winning strategy for player 1. Consider a memoryless strategy for player 2 and the
one-player game structure obtained after fixing the strategy. By Lemma 12, since player 1 satisfies
the MeanPayoffInf objective, there must be an scc $C$ reachable from $s_0$ (within $|S|$ steps) such that
the graph induced by $C$ has a nonnegative multi-cycle. Then there exist simple cycles $C_1, \dots, C_n$,
factors $m_1, \dots, m_n$ and finite paths $\pi_{1,2}, \pi_{2,3}, \dots, \pi_{n-1,n}, \pi_{n,1}$ such that:

1. The path $\pi_{i,j}$ is a path from $C_i$ to $C_j$ with length at most $|S|$.

2. For every $i = 1, \dots, k$, we have $\sum_{j=1}^{n} m_j \cdot w_i(C_j) \ge 0$.

A finite-memory strategy for player 1 is as follows: for large enough $Z$, follow the steps below:

    loop
        $Z \cdot m_1$ times in cycle $C_1$
        $\pi_{1,2}$
        $Z \cdot m_2$ times in cycle $C_2$
        $\pi_{2,3}$
        ...
        $Z \cdot m_n$ times in cycle $C_n$
        $\pi_{n,1}$
    end loop

In contrast with the strategy of Lemma 11, the above strategy plays the same in every round,
but for large enough $Z$; thus it can be implemented with finite memory. Let $L = |\pi_{1,2}| + |\pi_{2,3}| +
\dots + |\pi_{n-1,n}| + |\pi_{n,1}|$ be the sum of the lengths of the paths between cycles, and let $M = |C_1| + |C_2| +
\dots + |C_n|$ be the sum of the lengths of the cycles. Note that both $L$ and $M$ are bounded by $2^{|E|} \cdot |S|$
as $n \le 2^{|E|}$ and each path and cycle is of length at most $|S|$. Consider the steps executed in round
$i$: the sum of weights due to executing the cycles in all previous rounds is nonnegative in
all dimensions. Hence the sum of weights in any dimension, in the steps executed in round $i$ is at
least

$$-(|S| + Z \cdot M + i \cdot L + L) \cdot W.$$

The argument is as in Lemma 11. The number of steps executed so far is at least $(L + M) \cdot (i - 1) \cdot Z$.
Hence the average for all dimensions for all steps in round $i$ is at least

$$-\frac{(|S| + (i + 1) \cdot L + Z \cdot M) \cdot W}{(L + M) \cdot (i - 1) \cdot Z} \ge -\left( \frac{|S| \cdot W}{Z} + \frac{2 \cdot W}{Z} + \frac{W}{i - 1} \right),$$

for $i \ge 3$. With $Z$ large enough ($Z \ge \frac{(|S| + 2) \cdot W}{\alpha}$), it follows that as $i \to \infty$, the mean-payoff-inf value
is at least $-\alpha$ in every dimension, and hence the result follows. □
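
Because the fixed-$Z$ strategy repeats the same round forever, its mean-payoff value in each dimension is simply the average weight of one round. The sketch below computes this exactly for a small instance and searches for the least $Z$ that achieves a given $\alpha$. The instance, the function names and the search bound `z_max` are assumptions made for illustration; the initial prefix that reaches the scc is ignored, since it does not affect the limit average.

```python
from fractions import Fraction

def one_round(cycles, factors, paths, z):
    """Weight vectors of one round of the fixed-Z strategy."""
    weights = []
    for cycle, m, path in zip(cycles, factors, paths):
        weights.extend(cycle * (z * m))      # Z * m_j times around C_j
        weights.extend(path)                 # then the connecting path
    return weights

def periodic_mean_payoff(round_weights):
    """Per-dimension average of one round; the play repeats this round
    forever, so this is its mean-payoff value in each dimension."""
    k, n = len(round_weights[0]), len(round_weights)
    return [Fraction(sum(w[i] for w in round_weights), n) for i in range(k)]

def smallest_sufficient_z(cycles, factors, paths, alpha, z_max=10_000):
    """Least Z <= z_max whose periodic play has value >= -alpha everywhere."""
    for z in range(1, z_max + 1):
        values = periodic_mean_payoff(one_round(cycles, factors, paths, z))
        if all(v >= -alpha for v in values):
            return z
    return None

# Hypothetical instance: two self-loops whose weights cancel, joined by
# single edges of weight (-1, -1); m_1 = m_2 = 1.
CYCLES = [[(1, -1)], [(-1, 1)]]
FACTORS = [1, 1]
PATHS = [[(-1, -1)], [(-1, -1)]]

print(periodic_mean_payoff(one_round(CYCLES, FACTORS, PATHS, 1)))
print(smallest_sufficient_z(CYCLES, FACTORS, PATHS, Fraction(1, 10)))
```

In this instance every fixed $Z$ gives a strictly negative value, so no fixed-$Z$ strategy of this shape reaches threshold $0$ here, while any $\alpha > 0$ is reached for $Z$ large enough.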

Lemma 16. Let $G$ be a multi-weighted two-player game structure, and let $s_0$ be the initial state.
If there is a winning strategy for player 1 for the objective $\mathsf{MeanPayoffInf} = \{\pi \in \mathsf{Plays}(G) \mid \forall 1 \le
i \le k.\ (\underline{\mathsf{MP}}(\pi))_i \ge 0\}$, then for all $\alpha > 0$ there is a finite-memory winning strategy $\lambda$ and a number
$N_{\alpha,\lambda,s_0}$ such that against all strategies of player 2 and for all $n \in \mathbb{N}$ the sum of weights after $n$ steps
is at least $-(N_{\alpha,\lambda,s_0} + n) \cdot \alpha$ in every dimension, i.e., the average of the weights is at least $-2 \cdot \alpha$
once $n \ge N_{\alpha,\lambda,s_0}$.

Proof. Fix a finite-memory strategy $\lambda$ for player 1 to satisfy the objective $\mathsf{MeanPayoffInf}(-\alpha) =
\{\pi \in \mathsf{Plays}(G) \mid \forall 1 \le i \le k.\ (\underline{\mathsf{MP}}(\pi))_i \ge -\alpha\}$ (such a strategy exists by Lemma 15). Let $M$ be the
size of the memory. In the game structure obtained by fixing the strategy, in all cycles the average
of the weights in every dimension is at least $-\alpha$. Any path can be decomposed into an initial
prefix and a cycle-free segment at the end (each of length at most $M \cdot |S|$), while the remaining part is
decomposed into cycles (not necessarily simple cycles), as done in Lemma 10. The initial prefix
and the trailing segment are each of length at most $M \cdot |S|$, and the sum of their weights is at least $-2 \cdot M \cdot |S| \cdot W$.
Hence choosing $N_{\alpha,\lambda,s_0} \ge \frac{2 \cdot M \cdot |S| \cdot W}{\alpha}$ proves the desired result. □

Lemma 17. Let $G$ be a multi-weighted game structure with multi-mean-payoff objective
$\mathsf{MeanPayoffInfSup}(I, J) = \{\pi \in \mathsf{Plays}(G) \mid \forall i \in I : \underline{\mathsf{MP}}(\pi)_i \ge 0 \text{ and } \forall j \in J : \overline{\mathsf{MP}}(\pi)_j \ge 0\}$ for
player 1. For $\ell \in J$, let $\Phi_\ell = \mathsf{MeanPayoffInfSup}(I, \{\ell\})$ denote the objective that requires to satisfy
all MeanPayoffInf objectives and the MeanPayoffSup objective in dimension $\ell$. If for all states $s \in S$
and for all $\ell \in J$, player 1 has a winning strategy from $s$ for the objective $\Phi_\ell$, then for all states
$s \in S$, player 1 has a winning strategy from $s$ for the objective $\mathsf{MeanPayoffInfSup}(I, J)$.

The key idea of the proof is similar to Lemma 8 and we use Lemma 15 (details are presented
below for completeness). For all $s \in S$ and all $\ell \in J$, let $\lambda_1^{\ell}(s)$ be a winning strategy from $s$
for player 1 for the objective $\Phi_\ell$. Intuitively, the winning strategy for the conjunction of
mean-payoff objectives plays $\lambda_1^{\ell}(\cdot)$ until the mean-payoff value in dimension $\ell$ gets very close to $0$, and
then switches to a strategy for another value of $\ell \in J$. Thus player 1 ensures a nonnegative
mean-payoff value in every dimension, with mean-payoff-inf in the dimensions of $I$ and mean-payoff-sup in
the dimensions of $J$.



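Proof. Let $\alpha > 0$, and $s$ be the initial state. Let $\Phi_\ell(-\frac{\alpha}{2}) = \{\pi \in \mathsf{Plays}(G) \mid \forall i \in I : (\underline{\mathsf{MP}}(\pi))_i \ge
-\frac{\alpha}{2} \text{ and } (\overline{\mathsf{MP}}(\pi))_\ell \ge -\frac{\alpha}{2}\}$. Let $\lambda_1^{\ell,\alpha}(s)$ be a finite-memory winning strategy for player 1 for the
objective $\Phi_\ell(-\frac{\alpha}{2})$ with the initial state $s$ (the existence of a finite-memory winning strategy for
$\Phi_\ell(-\frac{\alpha}{2})$ follows from Lemma 14 and Lemma 15). For $Z \in \mathbb{N}$, consider the tree $T^{(\ell,Z)}_{\lambda_1^{\ell,\alpha}(s)}$ defined as
follows. Let $T_{\lambda_1^{\ell,\alpha}(s)}$ be the strategy tree for $\lambda_1^{\ell,\alpha}(s)$ with initial state $s$. We say that a node $v$ of
$T_{\lambda_1^{\ell,\alpha}(s)}$ is an $\alpha$-good node if the average of the weights in all dimensions in $I$ and dimension $\ell$ of the
path from the root to $v$ is at least $-\alpha$. The tree $T^{(\ell,Z)}_{\lambda_1^{\ell,\alpha}(s)}$ is obtained from $T_{\lambda_1^{\ell,\alpha}(s)}$ by removing all
descendants of $\alpha$-good nodes that are at depth at least $Z$. Hence, the leaves of $T^{(\ell,Z)}_{\lambda_1^{\ell,\alpha}(s)}$ are $\alpha$-good.

We show that $T^{(\ell,Z)}_{\lambda_1^{\ell,\alpha}(s)}$ is a finite tree. By König's Lemma [16], it suffices to show that every path
in the tree is finite. Assume towards contradiction that there is an infinite path $\pi$ in the tree. Hence
$\pi$ is a play consistent with $\lambda_1^{\ell,\alpha}(s)$, and since $\pi$ does not contain any $\alpha$-good node, it follows that
for some dimension $i \in I \cup \{\ell\}$ we have $(\overline{\mathsf{MP}}(\pi))_i \le -\alpha$ (and $(\underline{\mathsf{MP}}(\pi))_i \le -\alpha$ as well). It follows
that $\pi \notin \Phi_\ell(-\frac{\alpha}{2})$. This contradicts the assumption that $\lambda_1^{\ell,\alpha}(s)$ is a winning strategy for player 1
for $\Phi_\ell(-\frac{\alpha}{2})$.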

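We now describe a strategy for player 1 based on the finite-memory winning strategies for
$\Phi_\ell(-\frac{\alpha}{2})$ and show that the strategy is winning for the objective $\mathsf{MeanPayoffInfSup}(I, J)$.

1: $\alpha \leftarrow 1$
2: loop
3:   for $\ell \in J$ do
4:     Let $s$ be the current state, and $L$ be the play length so far.
5:     $Z \leftarrow \max\{\frac{L \cdot W}{\alpha}, N^{*}\}$ (where $N^{*} = \max\{N_{\alpha,\lambda,s} \mid s \in S, \ell' \in J, \lambda(s) = \lambda_1^{\ell',\alpha}(s)\}$, that is,
       $\lambda(s) = \lambda_1^{\ell',\alpha}(s)$ is the finite-memory strategy for $\Phi_{\ell'}(-\frac{\alpha}{2})$ from $s$, the number $N_{\alpha,\lambda,s}$ is
       as defined in Lemma 16 for the strategy, and $N^{*}$ is the maximum over $s \in S$ and $\ell' \in J$)
6:     Play according to $\lambda_1^{\ell,\alpha}(s)$ until a leaf $s'$ of $T^{(\ell,Z)}_{\lambda_1^{\ell,\alpha}(s)}$ is reached.
7:   end for
8:   $\alpha \leftarrow \frac{\alpha}{2}$
9: end loop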

Let $W \in \mathbb{N}$ be the largest absolute value of the weight function $w$. After the last command
in the internal for-loop was executed, the mean-payoff value in dimension $\ell$ is at least $\frac{-L \cdot W - Z \cdot \alpha}{L + Z}$,
where $Z \ge \frac{L \cdot W}{\alpha}$, and this is at least

$$\frac{-L \cdot W - \frac{L \cdot W}{\alpha} \cdot \alpha}{L + \frac{L \cdot W}{\alpha}} \ge -2 \cdot \alpha.$$

Consider the segment of the play for the round for a value of $\alpha$: let us denote by $M_b$ the number
of steps played till the beginning of the round and by $M_t$ the total number of steps
of the current round. Our goal is to obtain a lower bound on the average of the weights for all
$M_b \le n \le M_b + M_t$. In the beginning of the round (i.e., after $M_b$ steps) the average value for dimensions
in $I$ is at least $-2 \cdot \alpha$ (recall that $\alpha$ has been halved in line 8). Step 5 ensures that at least $N^{*}$
steps have been already played, i.e., $M_b \ge N^{*}$. It follows from Lemma 16 that for all dimensions
in $I$ and for all steps $M_b \le n \le M_b + M_t$ of the current round, the sum of the weights is at least
$-(M_b \cdot 2 \cdot \alpha + N^{*} \cdot \alpha + (n - M_b) \cdot \alpha)$, and hence the average value at step $n$ is at least

$$\frac{-(M_b \cdot 2 \cdot \alpha + N^{*} \cdot \alpha + (n - M_b) \cdot \alpha)}{n} \ge -4 \cdot \alpha$$

since $n \ge M_b$ and $n \ge N^{*}$. That is, for all steps in the round for $\alpha$, for all dimensions in $I$,
the average value is at least $-4 \cdot \alpha$. In every iteration of the external loop $\alpha$ gets smaller, and $L$ gets bigger.
Moreover, since the tree $T^{(\ell,Z)}_{\lambda_1^{\ell,\alpha}(s)}$ is finite, it follows that the main loop gets executed infinitely often
(i.e., the strategy does not get stuck in the for-loop). Thus when the length of the play tends to
infinity, the supremum of the mean-payoff value tends to a value at least $0$ in every dimension
$j \in J$, and the infimum of the mean-payoff value tends to a value at least $0$ in every dimension
$i \in I$. Hence the strategy described above is a winning strategy for player 1. □
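
The bound of $-4 \cdot \alpha$ on the average value used in the proof above can be checked term by term. The derivation below is a worked restatement of that step, not an additional claim. It writes $N^{*}$ for the threshold of step 5 and $M_b$ for the number of steps played before the current round, and uses only $M_b \le n$, $N^{*} \le n$ and $n - M_b \le n$.

```latex
\[
\frac{M_b \cdot 2\alpha + N^{*} \cdot \alpha + (n - M_b) \cdot \alpha}{n}
  = 2\alpha \cdot \frac{M_b}{n} + \alpha \cdot \frac{N^{*}}{n} + \alpha \cdot \frac{n - M_b}{n}
  \le 2\alpha + \alpha + \alpha = 4\alpha .
\]
```

Negating both sides gives the stated lower bound of $-4 \cdot \alpha$ on the average at step $n$.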

Lemma 18. In multi-mean-payoff games with objective $\mathsf{MeanPayoffInfSup}(I, J)$ for player 1,
memoryless strategies are sufficient for player 2.

Proof. The proof is similar to the proof of Lemma 9, and based on induction on the number of states
$|S|$ in the game structure. The base case with $|S| = 1$ is obvious. We now consider the inductive
case with $|S| = n \ge 2$. For $\ell \in J$, let $W_\ell$ be the winning region for player 2 for the objective $\Phi_\ell$ as
defined in Lemma 17. Let $W = \bigcup_{\ell \in J} W_\ell$. We consider the following two cases:

1. If $W = \emptyset$, then player 1 can satisfy the objective $\Phi_\ell$ for all $\ell \in J$, and then by Lemma 17
player 1 wins from everywhere for the objective $\mathsf{MeanPayoffInfSup}(I, J)$. Hence there is no
winning strategy for player 2.

2. If $W \ne \emptyset$, then there exists $\ell \in J$ such that $W_\ell \ne \emptyset$. In $W_\ell$ there is a memoryless winning
strategy $\lambda_2$ for player 2 to falsify $\Phi_\ell$, and the strategy also falsifies $\mathsf{MeanPayoffInfSup}(I, J)$ as
$\mathsf{MeanPayoffInfSup}(I, J) = \bigcap_{\ell \in J} \Phi_\ell$. The existence of a memoryless winning strategy for player 2
follows from the following facts: by Lemma 14 it follows that if player 2 can falsify the
objective $\Phi_\ell$ then player 2 can also falsify the objective where in the dimension $\ell$ we consider the
mean-payoff-inf objective instead of the mean-payoff-sup objective, and the existence of memoryless
strategies against mean-payoff-inf objectives follows from Theorem 7. The rest of the proof is
identical to the proof of Lemma 9 and can be omitted (we present it for the sake of completeness).
Since $W_\ell$ is a winning region for player 2 it follows that $W_\ell = \mathsf{Attr}_2(W_\ell)$, and hence the graph
$G'$ induced by $S \setminus W_\ell$ is a game structure. Let $W' = W \setminus W_\ell$ be the winning region for player 2
in $G'$. By the inductive hypothesis (since $G'$ has strictly fewer states as a non-empty set $W_\ell$ is
removed), it follows that there is a memoryless winning strategy $\lambda_2'$ in $G'$ for the region $W'$.
The winning region $S \setminus (W_\ell \cup W')$ for player 1 in $G'$ is also winning for player 1 in $G$ (since
$W_\ell = \mathsf{Attr}_2(W_\ell)$, $G'$ is obtained by removing only player 1 edges). Hence to complete the proof
it suffices to show that the memoryless strategy obtained by combining $\lambda_2$ in $W_\ell$ and $\lambda_2'$ in $W'$
is winning for player 2 from $W_\ell \cup W'$. Define the strategy $\lambda_2^{*}$ as follows:

$$\lambda_2^{*}(s) = \begin{cases} \lambda_2(s) & \text{if } s \in W_\ell \\ \lambda_2'(s) & \text{if } s \in W'. \end{cases}$$

Consider the memoryless strategy $\lambda_2^{*}$ for player 2 and the outcome of any counter strategy for
player 1 that starts in $W' \cup W_\ell$. There are two cases: (a) if the play reaches $W_\ell$, then it reaches it
in finitely many steps, and then $\lambda_2$ ensures that player 2 wins; and (b) if the play never reaches
$W_\ell$, then the play always stays in $G'$, and now the strategy $\lambda_2'$ ensures winning for player 2.



coNP upper bound. Since memoryless winning strategies exist for player 2, to establish the
coNP upper bound we need to show that one-player game structures with $\mathsf{MeanPayoffInfSup}(I, J)$
objectives can be solved in polynomial time. First we interpret $\mathsf{MeanPayoffInfSup}(I, J)$ as the
conjunction of $\Phi_\ell$ for $\ell \in J$. From Lemma 14 it follows that every $\Phi_\ell$ can be considered as a MeanPayoffInf
objective and hence can be solved in polynomial time for one-player game structures by the results
of Section 4.2. Hence the coNP upper bound follows. We have the following theorem summarizing
the results of this section.

Theorem 8. For multi-weighted two-player game structures with objective
$\mathsf{MeanPayoffInfSup}(I, J) = \{\pi \in \mathsf{Plays}(G) \mid \forall i \in I : \underline{\mathsf{MP}}(\pi)_i \ge 0 \text{ and } \forall j \in J : \overline{\mathsf{MP}}(\pi)_j \ge 0\}$
for player 1, the following assertions hold:

1. Winning strategies for player 1 require infinite memory in general, and memoryless winning
strategies exist for player 2.

2. The problem of deciding whether a given state is winning for player 1 is coNP-complete.

5 Conclusion 

In this work we considered games with multiple mean-payoff and energy objectives, and established
determinacy under finite memory, inter-reducibility of these two classes of games for finite-memory
strategies, and improved the complexity bounds from EXPSPACE to coNP-complete. We
also showed that multi-energy and multi-mean-payoff games under memoryless strategies are
NP-complete. Finally, we studied multi-mean-payoff games with infinite-memory strategies and showed
that multi-mean-payoff games with mean-payoff-sup objectives can be decided in NP ∩ coNP (and
can be solved in polynomial time if mean-payoff games with a single objective can be solved in
polynomial time), and that multi-mean-payoff games with mean-payoff-inf objectives, and with combinations of
mean-payoff-inf and mean-payoff-sup objectives, are coNP-complete. Thus we present optimal
computational complexity results for multi-energy and multi-mean-payoff games under finite-memory,
memoryless, and infinite-memory strategies.

Appendix

We discuss the results of [17], which show the existence of memoryless winning strategies for player 2
when the objective for player 1 is the conjunction of mean-payoff-inf objectives. We will also argue
that the results of [17] do not show the existence of memoryless winning strategies for player 2
when the objective for player 1 is the conjunction of mean-payoff-sup objectives (the result that
we establish in Lemma 9). The result of [17] requires the notion of convexity for prefix-independent
objectives.

Prefix-independent and convex objectives. An objective $\varphi$ is prefix-independent if for all plays
$\pi$ and $\pi'$ such that $\pi' = \rho \cdot \pi$, where $\rho$ is a finite prefix, we have $\pi \in \varphi$ iff $\pi' \in \varphi$, i.e., the objective
is independent of finite prefixes. A play $\pi$ is a combination of two plays $\pi_1 = u_1 u_3 u_5 \dots$ and $\pi_2 =
u_0 u_2 u_4 \dots$, where the $u_i$'s are finite prefixes, if $\pi = u_0 u_1 u_2 u_3 u_4 \dots$. An objective $\varphi$ is convex if it is closed
under combination. We refer the reader to [17] for further details. The results of [17] show that
if the objective for player 1 is prefix-independent and convex, then memoryless winning strategies
exist for player 2. It is easy to verify that mean-payoff-inf objectives are both prefix-independent
and convex. It follows that conjunctions of mean-payoff-inf objectives are also prefix-independent
and convex. Hence in games with conjunctions of mean-payoff-inf objectives, memoryless winning
strategies exist for player 2. We now show with an example that, in contrast, mean-payoff-sup
objectives are not convex.

Example 1. Consider a one-player game structure $G$ with two states $\{s_+, s_-\}$, with all edges, such
that all incoming edges to state $s_+$ have weight $+2$, and all incoming edges to $s_-$ have weight $-2$.
Consider the following play $\pi_0$:

1. Step 1. Repeat the self-loop in $s_-$ until the average weight of the play prefix is below $-1$, then
take the edge to $s_+$ and go to Step 2.

2. Step 2. Repeat the self-loop in $s_+$ until the average weight of the play prefix is above $1$, then
take the edge to $s_-$ and go to Step 1.

Consider the play $\pi_1$ obtained by exchanging $s_+$ and $s_-$ in $\pi_0$. It is easy to verify that $\overline{\mathsf{MP}}(\pi_0) =
\overline{\mathsf{MP}}(\pi_1) = +1$. However, for the combination $\pi_2$ of the two plays such that for all $i > 0$ the
$(2i - 1)$-th state of $\pi_2$ is the $i$-th state of $\pi_0$ and the $2i$-th state of $\pi_2$ is the $i$-th state of $\pi_1$, we get
that $\overline{\mathsf{MP}}(\pi_2) = 0$. It follows that mean-payoff-sup objectives are not convex.
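
The three plays of the example can be generated and inspected numerically. The sketch below is an illustration only: the state encoding (`"-"` for $s_-$, `"+"` for $s_+$), the function names, the play length `N` and the window start `TAIL` are assumptions chosen here. A finite prefix can only approximate a limit-superior, so the printed numbers are indicative and do not constitute a proof.

```python
def play_pi0(n):
    """First n states of pi_0 ('-' stands for s_-, '+' for s_+)."""
    states, total, state = ["-"], 0, "-"
    while len(states) < n:
        steps = len(states) - 1                # edges taken so far
        avg = total / steps if steps else 0.0
        if state == "-":
            nxt = "+" if avg < -1 else "-"     # Step 1 of the example
        else:
            nxt = "-" if avg > 1 else "+"      # Step 2 of the example
        total += 2 if nxt == "+" else -2       # weight of the edge into nxt
        states.append(nxt)
        state = nxt
    return states

def swap(states):
    """Exchange s_+ and s_- in a play."""
    return ["+" if s == "-" else "-" for s in states]

def interleave(a, b):
    """Play a_1, b_1, a_2, b_2, ... (the combination used in the example)."""
    return [s for pair in zip(a, b) for s in pair]

def running_averages(states):
    """Average edge weight of every nonempty prefix of the play."""
    out, total = [], 0
    for n, s in enumerate(states[1:], start=1):
        total += 2 if s == "+" else -2
        out.append(total / n)
    return out

N, TAIL = 200_000, 10_000
pi0 = play_pi0(N)
pi1 = swap(pi0)
pi2 = interleave(pi0, pi1)

sup0 = max(running_averages(pi0)[TAIL:])       # approximates MP-sup of pi_0
sup1 = max(running_averages(pi1)[TAIL:])       # approximates MP-sup of pi_1
dev2 = max(abs(a) for a in running_averages(pi2)[TAIL:])
print(sup0, sup1, dev2)
```

Over the chosen window, the largest running averages of $\pi_0$ and $\pi_1$ sit just above $1$, while the running average of the interleaved play stays close to $0$, because consecutive states of $\pi_2$ come in pairs whose incoming weights cancel.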






